Abstract

We apply nonparametric regression models to estimation of demand curves of the type
most often used in applied research. From the demand curve estimators we derive
estimates of exact consumers surplus and deadweight loss, which are the most widely used
welfare and economic efficiency measures in areas of economics such as public finance.
We  also  develop  tests  of  the  symmetry  and  downward  sloping  properties  of  compensated 
demand.     We  work  out  asymptotic  normal  sampling  theory  for  kernel  and  series 
nonparametric  estimators,  as  well  as  for  the  parametric  case. 

The  paper  includes  an  application  to  gasoline  demand.     Empirical  questions  of 
interest  here  are  the  shape  of  the  demand  curve  and  the  average  magnitude  of  welfare  loss 
from a tax on gasoline. In this application we compare parametric and nonparametric
estimates of the demand curve, calculate exact and approximate measures of consumers
surplus and deadweight loss, and give standard error estimates. We also analyze the
sensitivity of the welfare measures to components of nonparametric regression estimators
such as the number of terms in a series approximation.

1. Introduction

Nonparametric estimation of regression models has gained wide attention in the past
few years in econometrics. Nonparametric models are characterized by a very large
number of parameters. They are often difficult to interpret, and their usefulness in
applied research has been demonstrated in only a limited number of cases. We apply
nonparametric regression models to estimation of demand curves of the type most often
used in applied research. After estimating the demand curves, we derive
estimates of exact consumers surplus and deadweight loss, which are the most widely used
welfare and economic efficiency measures in areas of economics such as public finance.
We also work out asymptotic normal sampling theory for the nonparametric case, as well as
the parametric case (where, except in certain analytic cases, the results are not known).

The  paper  includes  an  application  to  gasoline  demand.     Empirical  questions  of 
interest  here  are  the  shape  of  the  demand  curve  and  the  average  magnitude  of  welfare  loss 
from a tax on gasoline. In this application we compare parametric and nonparametric
estimates of the demand curve, calculate exact and approximate measures of consumers
surplus and deadweight loss, and give estimates of standard errors. We also analyze the
sensitivity of the welfare measures to components of nonparametric regression estimators
such as window width and number of terms in a series approximation.

The definition of exact consumers surplus is based on the expenditure function:

$$CS(p^1, p^0, u^r) = e(p^1, u^r) - e(p^0, u^r),$$

where $p^0$ are initial prices, $p^1$ are new prices, and $u^r$ is the reference utility
level. The case $r = 1$ corresponds to the case we focus on here, equivalent variation:

$$EV(p^1, p^0, y) = e(p^1, u^1) - e(p^0, u^1) = y - e(p^0, u^1),$$

where $y$ denotes income, fixed over the price change. It is also easy to carry out a
similar analysis for compensating variation.

The measure of exact deadweight loss (DWL) used corresponds to the idea of the loss
in consumers surplus from imposition of a tax less the compensated tax revenues raised,
under the implicit assumption that they are returned to the consumer in a lump sum
manner. See Auerbach (1985) for a discussion of the various definitions of deadweight
loss (also called the excess burden of taxation). Here we use the Diamond-McFadden
(1974) definition of deadweight loss, where $p^1 - p^0$ is the vector of taxes. Then, for
equivalent variation the definition of exact DWL is:

$$DWL(p^1, p^0, y) = EV(p^1, p^0, y) - (p^1 - p^0)'h(p^1, u^1) = y - e(p^0, u^1) - (p^1 - p^0)'q(p^1, y),$$

where $h(p,u)$ is the compensated, or Hicksian, demand function and $q(p,y)$ is the
Marshallian (market) demand function. Thus, to estimate exact consumers surplus or exact
DWL it is necessary to estimate the expenditure function and the compensated demand
function, which are related by the equation

$$\partial e(p,u)/\partial p_j = h_j(p,u) \qquad \text{(Shephard's Lemma)}.$$

The expenditure function and compensated demand curve are estimated from observable data
on the market, or Marshallian, demand curve, $q_j(p,y)$.

Various approximations have been proposed for estimation of the exact welfare
measures. Willig (1976) demonstrated that the Marshallian measure of consumers surplus
is often close to the exact measures, and he derived bounds as a function of the income
share. Various authors have also recommended higher order Taylor-type approximations to
the exact welfare measures. The approximation approach has two limitations: the
approximations do not yield measures of precision, even though they are based on
estimated coefficients in most cases, and they may do poorly in some situations that are
difficult to specify a priori. Deaton (1986) discusses these problems in more detail.
Also, Hausman (1981) demonstrated that while the approximations may often do well for the
consumers surplus measure, they often do very poorly in measurement of DWL; Small and
Rosen (1981) also demonstrated a similar proposition in the discrete choice situation.

Two approaches have been proposed to estimation of the exact welfare measures from
estimation of ordinary demand functions. Hausman (1981) demonstrated that for many
widely used single equation demand specifications, the necessary differential equation
could be integrated to derive the expenditure function and also the compensated demand
function. He was also able to derive the sampling distribution for the measures of
consumers surplus and DWL, given the distribution of the estimated demand coefficients.
Vartia (1983), instead of using an analytic solution to the differential equations,
proposed a variety of numerical algorithms which estimate consumers surplus and DWL to
any desired degree of accuracy. While the Vartia approach can be applied to a wider
range of situations, no correct sampling distribution has been derived for the estimated
welfare measures, although some approximate results are given in Hayes and Porter-Hudak
(1986).

Both the Hausman approach and the Vartia approach can be applied to multiple price
changes and are path independent for the true demand function. The various approximation
approaches do not share the path independence property, which can lead to perplexing
computational results and has led to much theoretical misunderstanding in the relevant
literature. Here, we consider nonparametric estimation via an unrestricted estimator
that uses some, but not all, of the implications of path independence. Our approach allows
us to test path independence, i.e. symmetry of compensated demands, and to impose further
implications of path independence to construct more efficient estimators.

The Hausman and Vartia approaches begin with Roy's identity, which links the ordinary
demand function with the indirect utility function:

$$q_j(p,y) = -[\partial v(p,y)/\partial p_j]/[\partial v(p,y)/\partial y],$$

where $v(p,y)$ is the indirect utility function. The partial differential equation from
this equation can be solved along an indifference curve for a unique solution so long as
the initial values are differentiable. Let $p(t)$ denote a price path with $p(0) = p^0$
and $p(1) = p^1$, and let $y(t)$ be compensated income, satisfying $v(p(t),y(t)) = v(p^1,y)$.
Differentiating with respect to $t$ gives

$$[\partial v(p(t),y(t))/\partial p]'\,\partial p(t)/\partial t + [\partial v(p(t),y(t))/\partial y]\,\partial y(t)/\partial t = 0.$$

Hausman notes that this equation can be converted into an ordinary differential equation
by use of the implicit function theorem and Roy's identity. Let $S(t,y) = y - e(p(t),u^1)$
denote the equivalent variation for a price change from $p(t)$ to $p^1$. Then, since
compensated income is $y - S(t,y)$,

$$\partial S(t,y)/\partial t = -q(p(t),\, y - S(t,y))'\,\partial p(t)/\partial t, \qquad S(1,y) = 0. \tag{1.1}$$

Alternatively, this equation follows immediately from Shephard's Lemma and the definition
of $S(t,y)$. Hausman solves this equation for some widely used demand curves. The
solution gives both the expenditure function and indirect utility function, while the
right-hand side of this equation gives the compensated demands.

Vartia's numerical solutions to the differential equation also arise from equation
(1.1). To solve it Vartia uses a numerical method of Collatz. He orders $t_s = 1 - s/N$,
$(s = 0, \ldots, N)$, and defines $S$ iteratively by

$$S(t_{s+1}) = S(t_s) - .5[q(p(t_{s+1}),\, y - S(t_{s+1})) + q(p(t_s),\, y - S(t_s))]'[p(t_{s+1}) - p(t_s)].$$

This algorithm consists of averaging the demand at the last price and the current price
and multiplying this average by the change in price. By the envelope theorem, the product
of the price change times the quantity equals the additional income required to remain on
the same indifference curve. Intuitively, $dy = q'dp$, where $y$ is updated at each step of
the process, rather than held constant as in the Marshallian approximation to
consumer surplus. One can also use alternative numerical algorithms to solve
equation (1.1), which may lead to faster methods. We use a Bulirsch-Stoer algorithm
from Numerical Recipes that does not require solution of the implicit equation in
Vartia's algorithm, has a faster (quintic) convergence rate than Vartia's (cubic), and in
our empirical example leads to small estimated errors with few demand evaluations.
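The Vartia iteration described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the implementation used in the paper: it takes a single (scalar) price, a linear price path $p(t) = p^0 + t(p^1 - p^0)$, and resolves the implicit step by simple fixed-point iteration. The function names and the Cobb-Douglas demand used as an example are hypothetical.

```python
def cobb_douglas_demand(p, y, share=0.05):
    """Hypothetical demand q(p, y) = share * y / p, used only as an example."""
    return share * y / p


def vartia_surplus(demand, p0, p1, y, n_steps=100, tol=1e-12, max_iter=100):
    """Approximate S(0, y), the equivalent variation for a scalar price
    change p0 -> p1, by stepping equation (1.1) backward from S(1, y) = 0.

    Assumes a linear price path; t_s = 1 - s / n_steps as in the text.
    """
    s_cur = 0.0          # terminal condition S(1, y) = 0
    p_cur = p1           # price at t_0 = 1
    for step in range(n_steps):
        t_next = 1.0 - (step + 1) / n_steps
        p_next = p0 + t_next * (p1 - p0)
        q_cur = demand(p_cur, y - s_cur)
        # Explicit first guess, then fixed-point iteration on the implicit
        # trapezoid equation that averages demand at both prices.
        s_next = s_cur - q_cur * (p_next - p_cur)
        for _ in range(max_iter):
            q_avg = 0.5 * (demand(p_next, y - s_next) + q_cur)
            s_new = s_cur - q_avg * (p_next - p_cur)
            converged = abs(s_new - s_next) < tol
            s_next = s_new
            if converged:
                break
        s_cur, p_cur = s_next, p_next
    return s_cur
```

For the example demand the expenditure function is known in closed form, so the numerical answer can be compared with $y[1 - (p^0/p^1)^{\text{share}}]$; with 100 steps the two should agree to several decimal places.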

The possible shortcoming of all applications to date of both the Hausman approach
and the Vartia approach is that the parametric form of the demand function is required.
In most applied situations, the exact parametric specification of the demand curve (up to
unknown parameters) will not be known. Thus, commonly used demand curve specifications
may well lead to inconsistent estimates of the welfare measures if the demand curve is
misspecified. This problem is potentially quite important, especially in the case of
measuring deadweight loss, which depends on "second order" properties of the demand curve.
See Hausman (1981) for a discussion of the second order properties and their effect on
estimates of the deadweight loss.

Varian (1982 a,b,c) has proposed an alternative approach based on the revealed
preference ideas of Samuelson (1948) and Afriat (1967, 1973) which is nonparametric, but
is only able to estimate upper and lower bounds on the welfare measures. Varian's
nonparametric approximation approach is very interesting, but it often yields rather wide
bounds, because many price observations per individual are required for tight bounds.
Furthermore, use of sampling distributions to measure the precision of the estimates is
problematic. Here, we use a nonparametric cross-section demand analysis,
imposing enough homogeneity across individuals and smoothness of the demand function that
we can estimate it by nonparametric regression. We construct point estimates of exact
consumers surplus and DWL as well as precision estimates of the welfare measures. The
sensitivity of the welfare measures to the amount of smoothing used in the nonparametric
regression can be analyzed straightforwardly.


2. Estimation

Our estimator of consumer surplus is obtained by solving equation (1.1) numerically
with $q(p,y)$ replaced by an estimator obtained from nonparametric regression. We will
also allow covariates $w$ to enter the demand function $q(p,y,w)$. In our empirical work
these covariates are region and time dummies. In other contexts they might be
demographic variables. We will try to minimize the dimension of this nonparametric
function by restricting $w$ to enter in a parametric way. Let $T(g)$ denote a one-to-one
function with range equal to the nonnegative orthant of $\mathbb{R}^k$, such as $T(g) = e^g$,
let $x$ be a one-to-one function of $(p,y)$, and let $\tau(x)$ denote a "trimming" function,
to be further discussed below. We will assume that the true value of the demand function is
given by

$$q_0(p,y,w) = T(g_0(x,w)), \qquad g_0(x,w) = r_0(x) + w'\beta_0, \tag{2.1}$$
$$T^{-1}(q) = g_0(x,w) + \varepsilon, \qquad E[\varepsilon|x] = 0, \qquad E[\tau(x)w\varepsilon'] = 0,$$

where $q$ denotes observed quantity and the expectations are taken over the distribution
of a single observation from the data. We assume here that a numeraire good has been
excluded from $q$, so that $k$ is the number of goods, minus one.

Thus, the true demand function is assumed to be a function of a partially linear
specification for the regression of $T^{-1}(q)$ on $p$, $y$, and $w$.$^1$ A corresponding
estimator of the demand function will be $\hat q(p,y,w) = T(\hat g(x,w)) = T(\hat r(x) + w'\hat\beta)$,
where $\hat g$ is an estimator of $g_0$, such as the kernel or series estimator discussed
below. We estimate exact consumer surplus nonparametrically by substituting $\hat q(p,y,w)$
for $q(p,y)$ in equation (1.1) and solving numerically. The empirical results reported in
this version of the paper set $w$ equal to its sample mean and $\tau(x) = 1$.

$^1$ This specification does not restrict the joint distribution of the data, unlike previous
work on partially linear models (e.g. Robinson, 1988), where $E[\varepsilon|p,y,w] = 0$ is imposed.
A precise description of $g_0(x,w)$, for nonnegative $\tau$, is as the mean-square projection
of $E[q|x,w]$ on functions of the form $r(x) + w'\beta$ for the probability distribution
$\Pr_\tau(\mathcal{A}) = E[\tau(x)1(\mathcal{A})]/E[\tau(x)]$, where $1(\cdot)$ denotes the indicator function.

A kernel estimator is a locally weighted average that can be described as follows.
Let $K(v)$ be a kernel function, where $v$ has $k+1$ elements, satisfying $\int K(v)dv = 1$
and other regularity conditions discussed below, let $h > 0$ be a bandwidth parameter, and
let $K_h(v) = h^{-k-1}K(v/h)$. Also, let the data be denoted by $z_1, \ldots, z_n$, where $z$
includes $q$, $y$, $w$, and possibly other variables. For a matrix function $B(z)$, a kernel
estimator of $E[B(z)|x]$ is

$$\hat E[B(z)|x] = \sum_{j=1}^n B(z_j)K_h(x - x_j) \Big/ \sum_{j=1}^n K_h(x - x_j).$$


To estimate the partially linear specification in equation (2.1), we "partial out" the
coefficients of $w$ in a way analogous to that in Robinson (1988). The estimator of $g_0$
is

$$\hat g(x,w) = \hat r(x) + w'\hat\beta, \qquad \hat r(x) = \hat E[T^{-1}(q)|x] - \hat E[w|x]'\hat\beta, \tag{2.2}$$
$$\hat\beta = \Big[\sum_{i=1}^n \tau_i(w_i - \hat E[w|x_i])(w_i - \hat E[w|x_i])'\Big]^{-1} \sum_{i=1}^n \tau_i(w_i - \hat E[w|x_i])(T^{-1}(q_i) - \hat E[T^{-1}(q)|x_i]),$$

where $\tau_i = \tau(x_i)$. A convenient kernel that we consider in the empirical work is

$$K(v) = \begin{cases} C(1 - v'v), & v'v < 1, \\ 0, & \text{otherwise,} \end{cases} \tag{2.3}$$

where the constant $C$ is chosen so $\int K(v)dv = 1$. This is a multivariate Epanechnikov
kernel. The asymptotic theory requires kernels for which the integrals of $K(v)$ times
certain even powers of $v$ are equal to zero. Such kernels are often called "higher order,"
and are useful for reducing asymptotic bias in kernel estimation. Therefore, we also
consider in the empirical work a kernel of the form

(2.3a)  Kiv)  =  ^(v^)^{v2)(12  -  6v^  -  6v^  +  Sv^   +  lOv^v^  +  5v^), 


for  our  one  dimensional  price  (k  =  1)  application. 
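As an illustration of the kernel estimator and the partialling-out step in equation (2.2), here is a minimal sketch. It assumes a scalar covariate $w$, a scalar transformed quantity $T^{-1}(q)$ (called `ys` in the code), the Epanechnikov kernel of equation (2.3), and $\tau(x) = 1$. The kernel's normalizing constant and the factor $h^{-k-1}$ are omitted because they cancel in the ratio. All function names are hypothetical.

```python
def epanechnikov(v):
    """Unnormalized multivariate Epanechnikov kernel: 1 - v'v on the unit ball."""
    vv = sum(c * c for c in v)
    return 1.0 - vv if vv < 1.0 else 0.0


def kernel_mean(x0, xs, bs, h):
    """Kernel estimate of E[B | x = x0]; xs is a list of tuples, bs of scalars."""
    weights = [epanechnikov([(a - b) / h for a, b in zip(x0, xj)]) for xj in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no observations within one bandwidth of x0")
    return sum(wt * b for wt, b in zip(weights, bs)) / total


def partially_linear_fit(xs, ws, ys, h):
    """Partial out w as in equation (2.2) with tau(x) = 1 and scalar w.

    Returns (beta_hat, g_hat), where g_hat(x0, w0) = r_hat(x0) + w0 * beta_hat.
    """
    ey = [kernel_mean(x, xs, ys, h) for x in xs]   # E^[T^{-1}(q) | x_i]
    ew = [kernel_mean(x, xs, ws, h) for x in xs]   # E^[w | x_i]
    num = sum((w - m) * (y - e) for w, m, y, e in zip(ws, ew, ys, ey))
    den = sum((w - m) ** 2 for w, m in zip(ws, ew))
    beta_hat = num / den

    def g_hat(x0, w0):
        r_hat = kernel_mean(x0, xs, ys, h) - kernel_mean(x0, xs, ws, h) * beta_hat
        return r_hat + w0 * beta_hat

    return beta_hat, g_hat
```

The estimated demand is then $T(\hat g(x,w))$, for instance `math.exp(g_hat(x0, w0))` when $T(g) = e^g$. The sketch costs $O(n^2)$ kernel evaluations, which is acceptable for illustration but not tuned for large samples.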

It  is  well  known  that  the  choice  of  bandwidth     h     can  have  important  effects  on 
nonparametric  regression.     In  the  empirical  work  we  consider  a  data  based  choice  of     h, 
equal  to  a  "plug-in"  value  that  minimizes  estimated  asymptotic  mean  square  error,  and 
also  consider  the  sensitivity  of  the  results  to  the  choice  of  bandwidth.     In  the 
empirical  work  we  also  allow  the  kernel  to  be  data  based  in  that  we  normalize  by  the 
estimated  variance  of  price  and  income,  although  the  theory  does  not  allow  for  data-based 
bandwidth  or  kernel. 

A series estimator is the predicted value from a regression of the log of gasoline
consumption on some approximating functions for $p$ and $y$ and on $w$. For $x$ a
one-to-one, smooth transformation as above, let $\phi^K(x) = I_k \otimes (\phi_{1K}(x), \ldots, \phi_{KK}(x))'$
denote a matrix of functions of $x$, the idea being that $r(x)$ is to be approximated by
linear combinations of $\phi^K(x)$. Let $\phi^K(x,w) = (\phi^K(x)', w')'$ and let
$\hat\gamma = [\sum_{i=1}^n \phi^K(x_i,w_i)\phi^K(x_i,w_i)']^{-1} \sum_{i=1}^n \phi^K(x_i,w_i)T^{-1}(q_i)$
be the coefficients from a regression of $T^{-1}(q_i)$ on $\phi^K(x_i,w_i)$. A series
estimator of $g_0$ with $\tau(x) = 1$ is

$$\hat g(x,w) = \phi^K(x,w)'\hat\gamma = \phi^K(x)'\hat\eta + w'\hat\beta, \tag{2.4}$$

where $\hat\gamma = (\hat\eta', \hat\beta')'$ is partitioned conformably with $\phi^K(x,w)$. This
estimator can also be interpreted as "partialling out" $w$, satisfying equation (2.2) with
the kernel conditional expectation estimator replaced by $\hat E[B(z)|x] = \phi^K(x)'\hat\eta_B$,
where $\hat\eta_B$ are the coefficients of the least squares regression of $B(z_i)$ on $\phi^K(x_i)$.
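A minimal sketch of the series estimator in equation (2.4) follows, assuming a univariate $x$, a scalar $w$, a scalar dependent variable, and a power series basis $1, x, \ldots, x^{K-1}$. The least squares problem is solved through the normal equations with a small Gaussian elimination routine so that the example needs no external libraries; this is adequate for illustration but is not a numerically robust choice for large $K$. All names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def series_fit(xs, ws, ys, n_terms):
    """Least squares of y on (1, x, ..., x^(n_terms-1), w).

    Returns (eta_hat, beta_hat, g_hat) with g_hat(x, w) the fitted value.
    """
    def basis(x, w):
        return [x ** j for j in range(n_terms)] + [w]

    rows = [basis(x, w) for x, w in zip(xs, ws)]
    dim = n_terms + 1
    gram = [[sum(r[a] * r[b] for r in rows) for b in range(dim)] for a in range(dim)]
    moment = [sum(r[a] * y for r, y in zip(rows, ys)) for a in range(dim)]
    gamma_hat = solve_linear(gram, moment)

    def g_hat(x, w):
        return sum(c * v for c, v in zip(gamma_hat, basis(x, w)))

    return gamma_hat[:n_terms], gamma_hat[n_terms], g_hat
```

When the data are generated exactly by a polynomial of degree below `n_terms` plus a linear term in $w$, the fit reproduces the coefficients up to rounding error, which gives a simple check of the code.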

Two important types of approximating functions are power series and regression
splines. Power series are formed by choosing the elements of $\phi^K(x)$ to be products of
powers of the individual components of $x$. Power series are easy to compute and have
good approximation rates for smooth functions, although they are sensitive to outliers
and local behavior of the approximation, and can be highly collinear. The collinearity
problem can be overcome to some extent by replacing the powers of individual components
with orthogonal polynomials, which does not affect the estimator but may lead to easier
computation. Regression splines are piecewise polynomials of order $\alpha$ with fixed join
points. For univariate $x$ in $[0,1]$ with evenly spaced knots (i.e. join points),
regression spline approximating functions are $\phi_{jK}(x) = x^{j-1}$, $(j = 1, \ldots, \alpha+1)$,
and $\phi_{jK}(x) = (x - (j-\alpha-1)/(K-\alpha))_+^{\alpha}$, $j \ge \alpha + 2$, where
$(v)_+^{\alpha} = 1(v \ge 0)v^{\alpha}$. For multivariate $x$, a
regression spline can be formed from all cross-products of univariate splines in the
components of $x$. The spline approximation rate for very smooth functions is not as fast
as that of power series, but splines are less sensitive to outliers. Unlike power series,
the range of $x$ must be known in order to place the knots. In practice, such a known range
could be constructed by dropping from the data any observation where $x$ is not in some
known range. Also, the power spline sequence above can be highly collinear, but this problem
can be alleviated by replacing them with their corresponding B-splines; e.g. see Powell
(1981).
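The univariate regression spline basis just described can be written out directly. The sketch below assumes $x$ in $[0,1]$, spline order $\alpha \ge 1$, and $K \ge \alpha + 2$ terms; the function name is hypothetical, and the collinear truncated-power form is used rather than B-splines.

```python
def spline_basis(x, order, n_terms):
    """Truncated power spline basis on [0, 1] with evenly spaced knots.

    Returns the n_terms values: x^(j-1) for j = 1..order+1, followed by
    (x - knot_j)_+^order for knots (j - order - 1) / (n_terms - order),
    j = order+2..n_terms.
    """
    if order < 1 or n_terms < order + 2:
        raise ValueError("need order >= 1 and n_terms >= order + 2")
    powers = [x ** (j - 1) for j in range(1, order + 2)]
    knots = [(j - order - 1) / (n_terms - order)
             for j in range(order + 2, n_terms + 1)]
    truncated = [(x - k) ** order if x >= k else 0.0 for k in knots]
    return powers + truncated
```

For example, a linear spline (`order=1`) with `n_terms=4` has knots at 1/3 and 2/3, so the basis consists of 1, x, (x - 1/3)_+ and (x - 2/3)_+.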

Series estimators are sensitive to the number of terms in the approximation. In
the empirical work we choose the number of terms by cross-validation, and also try
different numbers of terms to see how the results are affected.

Returning now to estimation of consumer surplus, our estimator is constructed by
substituting the estimated demand function in equation (1.1) and integrating numerically.
As in Section 1, let $p(t)$ be a price path with $p(0) = p^0$ and $p(1) = p^1$. Also, let
$S(y,w,g)$ denote the solution to equation (1.1) with demand function $q(p,y) = T(g(x,w))$.
Then consumer surplus at particular income and covariate values $y_0$ and $w_0$
respectively, with a corresponding estimator, is

$$S_0 = S(y_0,w_0,g_0), \qquad \hat S = S(y_0,w_0,\hat g), \tag{2.5}$$

where $\hat g$ is a kernel, series, or other nonparametric estimator. A corresponding
deadweight loss value and associated estimator can be formed by subtracting the "tax
receipts," as in

$$L_0 = S_0 - (p^1-p^0)'T(g_0(p^1,y_0,w_0)), \qquad \hat L = \hat S - (p^1-p^0)'T(\hat g(p^1,y_0,w_0)). \tag{2.6}$$
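The construction in equations (2.5) and (2.6) can be illustrated with any demand function plugged into equation (1.1). The sketch below assumes a scalar price and a linear price path, and it integrates the differential equation with a classical fourth-order Runge-Kutta scheme. That scheme is a stand-in chosen for brevity; it is not the Bulirsch-Stoer routine mentioned in Section 1. The `demand` argument plays the role of $T(\hat g(x,w))$ with $w$ held fixed; all names are hypothetical.

```python
def surplus_rk4(demand, p0, p1, y, n_steps=50):
    """Solve dS/dt = -q(p(t), y - S) * dp/dt backward from S(1) = 0 to t = 0."""
    dp = p1 - p0

    def rhs(t, s):
        return -demand(p0 + t * dp, y - s) * dp

    h = -1.0 / n_steps          # negative step: integrate from t = 1 down to 0
    t, s = 1.0, 0.0
    for _ in range(n_steps):
        k1 = rhs(t, s)
        k2 = rhs(t + h / 2.0, s + h * k1 / 2.0)
        k3 = rhs(t + h / 2.0, s + h * k2 / 2.0)
        k4 = rhs(t + h, s + h * k3)
        s += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return s


def deadweight_loss(demand, p0, p1, y, n_steps=50):
    """Surplus minus tax receipts (p1 - p0) * q(p1, y), as in equation (2.6)."""
    return surplus_rk4(demand, p0, p1, y, n_steps) - (p1 - p0) * demand(p1, y)
```

With the hypothetical demand `lambda p, y: 0.05 * y / p`, the surplus has the closed form $y[1 - (p^0/p^1)^{0.05}]$, so both functions can be checked against exact values.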


A summary measure for consumer surplus can be obtained by averaging over income
values. In addition, it may sometimes be of interest to average over different prices,
to reflect the fact that individuals face different prices. To set up such an average
let $u$ index price paths, so that $p(t,u)$ is the price at $t \in [0,1]$ for price path
$u$, with initial and final prices $p(0,u)$ and $p(1,u)$ respectively. Also, let $z$
denote a single data observation that includes $u$ and values $y_0$ for income and $w_0$
for covariates. For example, $u$, $y_0$, and $w_0$ might be drawn (simulated) from some
distribution, or $y_0$ and $w_0$ might be the actual observations. Let $S(z,g)$ be the
solution to equation (1.1) for the price path $p(t,u)$ at $y_0$ and $w_0$, with demand
$q(p,y) = T(g(x,w))$. The average surplus and deadweight loss we consider are weighted
means across $z$, of the form

$$\mu_0 = E[\omega(z)S(z,g_0)]/E[\omega(z)], \qquad \hat\mu = \bar\omega^{-1}\sum_{i=1}^n \omega(z_i)S(z_i,\hat g)/n, \qquad \bar\omega = \sum_{i=1}^n \omega(z_i)/n, \tag{2.7}$$
$$\lambda_0 = \mu_0 - E[\omega(z)\{p(1,u)-p(0,u)\}'T(g_0(p(1,u),y_0,w_0))]/E[\omega(z)],$$
$$\hat\lambda = \hat\mu - \bar\omega^{-1}\sum_{i=1}^n \omega(z_i)\{p(1,u_i)-p(0,u_i)\}'T(\hat g(p(1,u_i),y_{0i},w_{0i}))/n.$$

It is interesting to note that the consumer surplus estimator may converge faster
than the deadweight loss estimator. In particular, $\hat S$ is like an integral over one
dimension (the variable $t$ in $p(t)$), and so has a faster convergence rate than $\hat L$,
which depends on the value of $\hat g$ at a particular point.$^2$ Similarly, average consumer
surplus and deadweight loss may have different convergence rates. One important case
where their convergence rates will be the same, both being the parametric $1/\sqrt{n}$ rate, is
when the initial and final prices and income have sufficient variation. An example would
be that where the initial price for each individual is the price they actually faced, and
the tax rate is the same across individuals. In this case averaging will take place over
all the arguments of the nonparametric estimates, which is known from the semiparametric
estimation literature to result in $\sqrt{n}$-consistency (under appropriate regularity
conditions).

$^2$ It is known from the work on semiparametric estimation that integrals or averages
converge faster than pointwise values, e.g. Powell, Stock, and Stoker (1989). The fact
that one-dimensional integrals converge faster than pointwise values has recently been
shown in Newey (1992a) for kernel estimators.

So far, our nonparametric consumer surplus estimators have ignored the residual
$\varepsilon = T^{-1}(q) - g_0(p,y,w)$. This approach is consistent with current practice in applied
econometrics, and is difficult to improve on without more information about the residual.
One can ignore the residual if it is all measurement error, but not if it contains
individual heterogeneity. Hausman (1985) shows that it is possible to separate out
measurement error and heterogeneity in parametric, nonlinear models, but the amount of
heterogeneity is not identified in our nonparametric, linear in residual specification.
Even when $\varepsilon$ is all heterogeneity, it may be possible to interpret the demand function
as corresponding to a particular consumer type. Suppose that $\varepsilon = \varepsilon(p,y,w,v)$ for some
function $\varepsilon(p,y,w,v)$ of prices, income, covariates and a taste variable $v$, where $v$
is independent of price and income. In general $\varepsilon(p,y,w,v)$ will depend on $p$ and $y$,
as shown by Brown and Walker (1990).$^3$ Nevertheless, if $\varepsilon(p,y,w,v)$ is identically zero
for some value of $v$ then $g_0(p,y,w)$ can be interpreted as the demand function for that
value of $v$ (e.g. for $\varepsilon = \sigma(p,y,w)v$). In the rest of the paper we stay with the
specification of demand as $T(g_0(p,y,w))$, corresponding to an interpretation of $\varepsilon$ as
measurement error or to evaluation at a particular consumer type.$^4$

$^3$ They showed that for $T(q) = q$ the residual $\varepsilon$ must be functionally dependent on $p$
and $y$. Also, Brown has indicated to us in private communication that the same result is
true for $T(q) = e^q$.

$^4$ An alternative approach recently suggested by Brown and Newey (1992) is to average
consumer surplus over different consumer types, when $\varepsilon(p,y,w,v)$ is an estimable,
one-to-one function of $v$.

3. Asymptotic Variance Estimation

Under regularity conditions given in Section 6, all of the estimators will be
asymptotically normal. To be specific, let $\hat\theta$ denote any one of the estimators
previously presented. For a kernel estimator there will be $V_0$ and $\alpha \ge 0$ such that

$$\sqrt{n}\,\sigma^{\alpha}(\hat\theta - \theta_0) \xrightarrow{d} N(0, V_0). \tag{3.1}$$

The "full-average," $\sqrt{n}$-consistent case corresponds to $\alpha = 0$, while in other cases the
convergence rate for $\hat\theta$ will be $1/(\sqrt{n}\,\sigma^{\alpha})$, which is slower than $1/\sqrt{n}$
because $\sigma \to 0$. For a series estimator there will be $V_n$ such that

$$\sqrt{n}\,V_n^{-1/2}(\hat\theta - \theta_0) \xrightarrow{d} N(0, 1). \tag{3.2}$$

In the $\sqrt{n}$-consistent case the series and kernel estimators will have the same asymptotic
variance, with $V_n$ converging to $V_0$ from equation (3.1). In other cases the two
estimators generally will not have the same asymptotic variance, and the series estimator
will satisfy the weaker property of equation (3.2), which does not specify a rate of
convergence. Exact convergence rates for series estimators are not yet known, except
for $\sqrt{n}$-consistency, although it is possible to bound the convergence rate. Despite this
lack of a convergence rate, equation (3.2) can still be used for asymptotic inference.
For large sample inference, suppose that there is an estimator $\hat V_n$ of $V_n$ in
equation (3.2). If $\hat V_n/V_n \xrightarrow{p} 1$ then it follows by equation (3.2) (and the Slutsky
theorem) that

$$\sqrt{n}\,\hat V_n^{-1/2}(\hat\theta - \theta_0) \xrightarrow{d} N(0, 1). \tag{3.3}$$

Consequently, a $1-\alpha$ large sample confidence interval will be

$$\hat\theta \pm z_{\alpha/2}\sqrt{\hat V_n/n}, \tag{3.4}$$

where $z_{\alpha/2}$ is the $1 - \alpha/2$ quantile of the standard normal distribution. One could
also use equation (3.4) to form large sample hypothesis tests.
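The interval in equation (3.4) is straightforward to compute once $\hat\theta$ and $\hat V_n$ are available. A minimal sketch, using the normal quantile function from the Python standard library (the function name `confidence_interval` is hypothetical):

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(theta_hat, v_hat, n, alpha=0.05):
    """Large sample 1 - alpha interval: theta_hat +/- z_{alpha/2} * sqrt(v_hat / n)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # 1 - alpha/2 normal quantile
    half_width = z * sqrt(v_hat / n)
    return theta_hat - half_width, theta_hat + half_width
```

Here `v_hat` is the estimate of the variance of $\sqrt{n}(\hat\theta - \theta_0)$, so the standard error of $\hat\theta$ itself is `sqrt(v_hat / n)`.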

To form large sample confidence intervals, as in equation (3.4), an estimator $\hat V_n$
of $V_n$ is needed. For kernel estimators, one method of forming $\hat V_n$ would be to derive
a formula for $V_0$ and then substitute estimators for unknown quantities to form $\sigma^{-2\alpha}\hat V_0$.
This procedure is not very feasible, because the asymptotic variances are quite
complicated, as described in Section 6. Instead we use an alternative method, from Newey
(1992a), that only requires knowing the form of $\hat\theta$. For series estimators we also use a
method that just uses the form of $\hat\theta$.

The asymptotic variance estimators for kernel and series estimators have some common
features. In each case,

$$\hat V_n = \sum_{i=1}^n \hat\psi_{ni}^2/n, \tag{3.5}$$

where the estimates $\hat\psi_{ni}$ are constructed in the following way. Also, in each case, the
variance will be based on the form of the surplus estimator, as in

$$\hat\theta = \bar\omega^{-1}\sum_{i=1}^n a(z_i,\hat g)/n, \qquad \bar\omega = \sum_{i=1}^n \omega(y_i)/n, \tag{3.6}$$

where $a(z,g)$ is a function of a single observation $z$ and the partially linear
specification $g = g(x,w) = r(x) + w'\beta$, $\hat g$ is the kernel or series estimator described
earlier, and $\omega(y)$ is a weight function. The specification of $a(z,g)$ and $\omega(y)$
corresponding to each case is


$$\theta = S: \quad a(z_i,g) = S(y_0,w_0,g); \quad \omega(y) = 1; \tag{3.7}$$
$$\theta = L: \quad a(z_i,g) = S(y_0,w_0,g) - (p^1-p^0)'T(g(p^1,y_0,w_0));$$
$$\theta = \mu: \quad a(z_i,g) = \omega(y_i)S(z_i,g);$$
$$\theta = \lambda: \quad a(z_i,g) = \omega(y_i)\{S(z_i,g) - [p(1,u_i)-p(0,u_i)]'T(g(p(1,u_i),y_i,w_i))\},$$

where $g(p,y,w)$ denotes $g(x,w) = r(x) + w'\beta$ evaluated at the $x$ that is the image of
$(p,y)$. For both kernel and series estimation, $\hat\psi_{ni}$ will have two components, one of
which accounts for the variability of $z_i$ in $a(z_i,g)$ and $\omega(y_i)$, and the other for
the variability of $\hat g$. The first component is the same for both kernel and series, so
that we can specify


(3.8)  ^  .  =  0^  +  !/>?,     0^  =  w  ^a(z.,g)  -  e  -  (J  ^e[w(y.)  -  w], 


ni 

"Z  "-R 

where  aui     n     subscript  on     \jj.     and     0?     is  suppressed  for  notational  convenience.     The 

iji.     term  is  an  asymptotic  approximation  to  the  influence  on     0     of  the     i         observation 

in     0)    y.   ,a(z.,g).     The  term     !L.     will  account  for  the  estimation  of     g.     It  can  be 
'^1=1      r"  1 

th 
constructed  from  an  asymptotic  approximation  to  the  influence  on     6     of  the     i 

observation  in     g,     taking  a  different  form  for  kernel  and  series  estimators. 

For kernel estimators the idea for forming $\hat\psi_i^g$, developed in Newey (1992a), is to
differentiate with respect to the $i$th observation in the kernel estimator. This
calculation amounts to a "delta-method" for kernels, and leads to an estimator that is
robust to heteroskedasticity and has the same form, no matter what the convergence rate
of $\hat\theta$ is. To describe the estimator, let $\delta$ denote a scalar and

(3.9)  $\hat h(x) = \sum_{j=1}^n \{T^{-1}(q_j) - w_j'\hat\beta\}K_h(x - x_j)/n, \qquad \hat f(x) = \sum_{j=1}^n K_h(x - x_j)/n,$

       $\hat A_i = \partial\big[\sum_{j=1}^n a\big(z_j, \{\hat f + \delta K_h(\cdot - x_i)\}^{-1}\{\hat h + \delta[T^{-1}(q_i) - w_i'\hat\beta]K_h(\cdot - x_i)\} + w'\hat\beta\big)/n\big]/\partial\delta\,\big|_{\delta=0},$

       $\hat G_\beta = \partial\big[\sum_{j=1}^n a(z_j,\hat g_\beta)/n\big]/\partial\beta\,\big|_{\beta=\hat\beta}, \qquad \hat g_\beta(x,w) = \hat E[T^{-1}(q) - w'\beta \mid x] + w'\beta,$

       $\hat M = \sum_{i=1}^n \tau_i(w_i - \hat E[w|x_i])(w_i - \hat E[w|x_i])'/n,$

       $\hat\psi_i^g = \bar\omega^{-1}\{\hat A_i - \sum_{j=1}^n \hat A_j/n + \hat G_\beta'\hat M^{-1}\tau_i(w_i - \hat E[w|x_i])(T^{-1}(q_i) - \hat g(x_i,w_i))\}.$

The term $\hat A_i$ is an asymptotic approximation to the influence of the $i$th observation in
$\hat r(x)$ on the average of $a(z_i,g)$, while the second term in $\hat\psi_i^g$ is a fairly standard
delta-method term for estimation of $\beta$.
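The two kernel averages in (3.9) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it takes a scalar $x$, uses the Epanechnikov kernel, and the function names `epanechnikov` and `kernel_r_hat` are hypothetical. It returns $\hat r(x) = \hat h(x)/\hat f(x)$ at the requested evaluation points, given an estimate of $\beta$.

```python
import numpy as np

def epanechnikov(u):
    # Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def kernel_r_hat(x_eval, x, y, w, beta_hat, h):
    """Sketch of r_hat(x) = h_hat(x) / f_hat(x) for scalar x.

    y plays the role of T^{-1}(q); w is the n-by-m covariate matrix;
    h is the bandwidth. All names here are illustrative assumptions.
    """
    x_eval = np.atleast_1d(x_eval)
    partial_resid = y - w @ beta_hat                  # T^{-1}(q_j) - w_j' beta_hat
    # K_h(x - x_j) = K((x - x_j) / h) / h, one row per evaluation point
    K = epanechnikov((x_eval[:, None] - x[None, :]) / h) / h
    f_hat = K.mean(axis=1)                            # density average
    h_hat = (K * partial_resid[None, :]).mean(axis=1) # numerator average
    return h_hat / f_hat                              # undefined where f_hat == 0

# Small check: if T^{-1}(q) - w'beta is constant, r_hat recovers that constant.
x = np.linspace(0.0, 1.0, 50)
w = np.column_stack([np.arange(50) % 2])
beta = np.array([0.5])
y = 2.0 + w @ beta
r_demo = kernel_r_hat(np.array([0.3, 0.7]), x, y, w, beta, h=0.2)
```

The ratio is only defined where at least one observation falls within one bandwidth of the evaluation point, which is one reason a trimming function is used in practice.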

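For series estimators the idea is to apply the "delta-method" as if the series
approximation were exact. This results in correct asymptotic inference because it
accounts properly for the variance, while the bias is small under appropriate regularity
conditions. To describe the estimator, let

(3.10)  $\hat G_\gamma = \partial\big[\sum_{i=1}^n a(z_i,g_\gamma)/n\big]/\partial\gamma\,\big|_{\gamma=\hat\gamma}, \qquad g_\gamma(x,w) = \phi^K(x,w)'\gamma,$

        $\hat\psi_i^g = \bar\omega^{-1}\hat G_\gamma'\hat\Sigma^{-1}\phi^K(x_i,w_i)[T^{-1}(q_i) - \hat g(x_i,w_i)], \qquad \hat\Sigma = \sum_{i=1}^n \phi^K(x_i,w_i)\phi^K(x_i,w_i)'/n,$

where $\phi^K(x,w)$ denotes the vector of series approximating functions. Here $\hat\psi_i^g$ is a
standard "delta-method" term for ordinary least squares estimation of $\gamma$.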

For either kernels or series, the main difficulty in computing $\hat\psi_i^g$ is calculating
the derivatives $\hat A_i$ and $\hat G_\beta$ or $\hat G_\gamma$. For each of the estimators described in Section
2, it is possible to derive analytical expressions for these derivatives, but the
expressions are so complicated as to make them almost useless for calculation. Instead,
these derivatives can be calculated by numerical differentiation. This calculation only
requires evaluation of $\sum_{i=1}^n a(z_i,g)/n$ for many different values of $g$, which is quite
feasible, particularly for series estimators.
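The numerical-differentiation step for a series estimator can be sketched as follows. This is an illustrative sketch on simulated data, not the authors' code: the basis in `basis`, the linear price path, the Heun integration scheme, and all function names are assumptions. It treats the case $\theta = S$ with $\omega = 1$, where $a(z_i,g)$ does not depend on $z_i$, so the first component of (3.8) is zero and only the series term of (3.10) contributes.

```python
import numpy as np

def basis(logp, logy):
    # Hypothetical power-series terms in log price and log income
    return np.stack([np.ones_like(logp), logp, logy, logp**2], axis=-1)

def surplus(gamma, p0, p1, y0, steps=200):
    # Equivalent variation S(0) from dS/dt = -q(p(t), y0 - S) p'(t), S(1) = 0,
    # integrated backward from t = 1 along a linear price path (Heun's method).
    dp = p1 - p0
    h = 1.0 / steps

    def f(t, S):
        q = np.exp(basis(np.log(p0 + t * dp), np.log(y0 - S)) @ gamma)
        return -q * dp

    S, t = 0.0, 1.0
    for _ in range(steps):
        k1 = f(t, S)
        k2 = f(t - h, S - h * k1)
        S -= 0.5 * h * (k1 + k2)
        t -= h
    return S

def num_grad(fun, gamma, eps=1e-5):
    # Central-difference gradient of a scalar function of the coefficients
    grad = np.zeros_like(gamma)
    for k in range(gamma.size):
        step = np.zeros_like(gamma)
        step[k] = eps
        grad[k] = (fun(gamma + step) - fun(gamma - step)) / (2.0 * eps)
    return grad

# Simulated data (purely illustrative, not the gasoline survey data)
rng = np.random.default_rng(0)
n = 2000
logp = rng.uniform(np.log(0.8), np.log(1.6), n)
logy = rng.normal(np.log(20000.0), 0.5, n)
logq = 1.0 - 0.8 * logp + 0.4 * logy + rng.normal(0.0, 0.7, n)

X = basis(logp, logy)
gamma_hat = np.linalg.lstsq(X, logq, rcond=None)[0]   # OLS series coefficients
resid = logq - X @ gamma_hat
Sigma_hat = X.T @ X / n

theta_hat = surplus(gamma_hat, 1.0, 1.3, 20000.0)
G_hat = num_grad(lambda g: surplus(g, 1.0, 1.3, 20000.0), gamma_hat)
psi = (X @ np.linalg.solve(Sigma_hat, G_hat)) * resid  # series term with omega = 1
V_hat = np.mean(psi**2)                                # as in (3.5)
se = np.sqrt(V_hat / n)
print(theta_hat, se)
```

Each gradient component costs two evaluations of the surplus, so the whole calculation needs only a handful of ODE solutions, which is why the approach is cheap for series estimators.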

A procedure analogous to that for the series estimator can be used to construct a
consistent asymptotic variance estimator for exact consumer surplus for any parametric
specification of the demand function. For a parametric specification, $a(z,\gamma)$ will
depend on the parameters $\gamma$ of the demand function. In this case
$\hat G_\gamma = \partial[\sum_{i=1}^n a(z_i,\gamma)/n]/\partial\gamma\,|_{\gamma=\hat\gamma}$ can be calculated by numerical differentiation. Then, supposing
that $\sqrt{n}(\hat\gamma - \gamma_0) = \sum_{i=1}^n \psi^\gamma(z_i)/\sqrt{n} + o_p(1)$ and that $\hat\psi_i^\gamma$ is an estimator of $\psi^\gamma(z_i)$, we
can form

(3.11)  $\hat\psi_i = \hat\psi_i^z + \bar\omega^{-1}\hat G_\gamma'\hat\psi_i^\gamma.$

Asymptotic inference for parametrically estimated exact consumer surplus could then be
carried out as described above for $\hat V_n$ as in equation (3.4).

4.       Testing Consumer Demand Conditions

Tests of the downward slope and symmetry of Hicksian (compensated) demands provide
useful specification checks for consumer surplus estimates, and are of interest in their
own right as tests of consumer theory. Here we consider tests that are natural
by-products of consumer surplus estimation. An implication of symmetry is that consumer
surplus is independent of the price path, which can be tested by comparing estimates
based on different price paths. The downward sloping property can be tested by comparing
the demand at the new price with compensated demand at the initial price, which is easily
computed from the consumer surplus estimate.

Path independence can be tested by comparing consumer surplus for the same income
and covariates but different price paths. To describe this test, let $j$ index a price
path $p^j(t)$, with $p^j(0) = p^0$ and $p^j(1) = p^1$. Let $\hat S_j$ denote the equivalent
variation estimator described above for the price path $p^j(t)$, income $y_0$, and
covariates $w_0$. An implication of symmetry of compensated demand is that all $\hat S_j$,
$(j = 1, \ldots, J)$, should converge to the same limit. This implication can be tested by
comparing the different estimators. A simple way to construct this test is by minimum
chi-square. Let $\hat\psi_i^j$ denote the corresponding estimator from equation (3.7), and

(4.1)  $\hat S = (\hat S_1,\ldots,\hat S_J)', \qquad \hat\psi_i = (\hat\psi_i^1,\ldots,\hat\psi_i^J)', \qquad \hat\Omega = \sum_{i=1}^n \hat\psi_i\hat\psi_i'/n.$

Here $\hat\Omega$ is an estimator of the joint asymptotic variance of $\hat S$. Let $e$ denote a $J \times 1$
vector of 1's. Then the test statistic is given by

(4.2)  $T = n(\hat S - \bar S e)'\hat\Omega^{-1}(\hat S - \bar S e), \qquad \bar S = (e'\hat\Omega^{-1}e)^{-1}e'\hat\Omega^{-1}\hat S.$

Under the symmetry hypothesis the asymptotic distribution of this test statistic will be
$\chi^2(J-1)$.


The estimator $\bar S$ may be of interest in its own right. By the usual minimum
chi-square estimation theory, under symmetry of compensated demands it will be at least
as asymptotically efficient as any of the estimators $\hat S_j$, with an estimated asymptotic
variance $(e'\hat\Omega^{-1}e)^{-1}$. This efficiency improvement leads to the question of an
efficiency bound. This question could be answered, in part, by deriving the optimal way
to average over all different price paths, a question outside the scope of this paper.
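The minimum chi-square calculation in (4.1) and (4.2) reduces to a few matrix operations. The sketch below is only an illustration: the function name `min_chi_square` and the input layout (a length-$J$ vector of surplus estimates and an $n \times J$ matrix whose rows are the stacked influence estimates) are assumptions.

```python
import numpy as np

def min_chi_square(S_hat, Psi):
    """Pooled estimate and path-independence statistic, following (4.1)-(4.2).

    S_hat : length-J vector of surplus estimates, one per price path
    Psi   : n-by-J matrix, row i holding the J influence estimates
    Returns (S_bar, T, dof); T is compared with a chi-square(dof) critical value.
    """
    S_hat = np.asarray(S_hat, dtype=float)
    n, J = Psi.shape
    Omega = Psi.T @ Psi / n                  # average outer product
    e = np.ones(J)
    Oinv_e = np.linalg.solve(Omega, e)       # Omega^{-1} e
    S_bar = (Oinv_e @ S_hat) / (Oinv_e @ e)  # (e'O^{-1}e)^{-1} e'O^{-1} S_hat
    d = S_hat - S_bar * e
    T = n * (d @ np.linalg.solve(Omega, d))
    return S_bar, T, J - 1

# Toy input for which Omega is the identity: Psi = sqrt(n) * I with n = J = 3
S_bar_demo, T_demo, dof_demo = min_chi_square([1.0, 2.0, 3.0], np.sqrt(3.0) * np.eye(3))
```

With an identity variance matrix the pooled estimate is the simple average of the path-specific estimates, and the statistic is $n$ times the sum of squared deviations from it.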

The downward slope of compensated demands can be tested by testing for nonnegativity
of $(p^1 - p^0)'[T(g_0(p^1,y_0,w_0)) - T(g_0(p^0,y_0 - S(y_0,w_0,g_0),w_0))]$ over several different prices,
incomes, and covariates. To describe this test, let $p_j^0$, $p_j^1$, $y_{j0}$, $w_{j0}$, $(j = 1, \ldots, J)$,
denote different values, let $\hat S_j$ denote the corresponding equivalent variation
estimates, and

(4.3)  $\hat\theta_j = (p_j^1 - p_j^0)'[T(\hat g(p_j^1,y_{j0},w_{j0})) - T(\hat g(p_j^0,y_{j0} - \hat S_j,w_{j0}))], \qquad (j = 1, \ldots, J).$

An implication of convexity of the expenditure function is that each of these estimators
should have a nonnegative limit. This hypothesis can be tested using an estimator of the
asymptotic variance matrix of $\hat\theta = (\hat\theta_1,\ldots,\hat\theta_J)'$, which can be constructed via the
approach of Section 3. Let $\hat\psi_i^j$ be as described in equation (3.7), for
$a(z,g) = (p_j^1 - p_j^0)'[T(g(p_j^1,y_{j0},w_{j0})) - T(g(p_j^0,y_{j0} - S_j(y_{j0},w_{j0},g),w_{j0}))]$, $\omega(z) = 1$, and let
$\hat\Omega = \sum_{i=1}^n (\hat\psi_i^1,\ldots,\hat\psi_i^J)'(\hat\psi_i^1,\ldots,\hat\psi_i^J)/n$. Alternatively, if the income values $y_{j0}$ are mutually
distinct then the asymptotic covariances between the $\hat\theta_j$ will be zero, so that
$\hat\Omega = \sum_{i=1}^n \mathrm{diag}((\hat\psi_i^1)^2,\ldots,(\hat\psi_i^J)^2)/n$ will suffice. The asymptotic approximation to the
distribution of $\hat\theta$ is then that $\hat\theta$ is normal with variance $\hat\Omega/n$. Thus, the hypothesis
that the limit of $\hat\theta$ is a nonnegative vector can be tested by applying multivariate
tests of inequality restrictions developed in the statistics literature. A particularly
simple test would be to reject if $\min_j\{\sqrt{n}\,\hat\theta_j/\hat\Omega_{jj}^{1/2}\} \le k$ for some $k$. The size of this
test could be calculated by simulating the distribution of the minimum of a vector of mean
zero normals with variance matrix equal to the correlation matrix implied by $\hat\Omega$.
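The simulation described in the last sentence can be sketched as below. This is an illustrative sketch only: the function name `min_test_critical_value`, the number of draws, and the choice of computing the critical value at the boundary case of a zero mean vector are assumptions, not details from the text.

```python
import numpy as np

def min_test_critical_value(Omega, alpha=0.05, draws=200_000, seed=0):
    """Simulated k with P(min_j Z_j <= k) = alpha for Z ~ N(0, corr(Omega))."""
    Omega = np.atleast_2d(np.asarray(Omega, dtype=float))
    sd = np.sqrt(np.diag(Omega))
    corr = Omega / np.outer(sd, sd)          # correlation matrix implied by Omega
    rng = np.random.default_rng(seed)
    z = rng.multivariate_normal(np.zeros(len(sd)), corr, size=draws)
    return np.quantile(z.min(axis=1), alpha) # alpha-quantile of the minimum

k_one = min_test_critical_value([[2.0]])                    # single restriction
k_two = min_test_critical_value([[1.0, 0.0], [0.0, 4.0]])   # two uncorrelated ones
```

With a single restriction the simulated value should be close to the usual one-sided normal critical value; adding restrictions pushes the critical value further into the lower tail.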

It is possible to combine these two types of tests, of symmetry and of downward
sloping compensated demand, into a single test of consumer demand theory, by "stacking"
$\hat S$ and $\hat\theta$ into a single vector $(\hat S',\hat\theta')'$. The $\hat\psi$ values could be stacked in the
corresponding way, and an estimator of the variance of $(\hat S',\hat\theta')'$ formed as the average
outer product of the stacked vector of $\hat\psi$ values. Also, it would be possible to
consider versions of these tests based on the consumer surplus averages of equation
(2.7). Furthermore, it should be possible to give these tests some asymptotic power
against any alternative to demand theory by letting $J$ grow with the sample size in such
a way that the different price paths and income and covariate values "cover" all values in
their support. These extensions are beyond the scope of this paper.


5.        An  Application  to  Gasoline  Demand 

To estimate the nonparametric and parametric demand functions for gasoline, we use
data from the U.S. Department of Energy. The first three waves of the data were
collected in the Residential Energy Consumption Survey conducted for the Energy
Information Administration of the Department of Energy. Surveys were conducted in 1979,
1980, and 1981 at the household level. Gasoline consumption is kept by diary for each month;
in our analysis we use average household gallons consumed per month. The gasoline price
is the weighted average of purchase price over a month. Note that gasoline prices were
quite high during most of this period in the U.S. because of the second (Iranian) oil
shock. Gasoline prices averaged between $1.34 and $1.46 for these three years, measured in
1983 dollars. Income is divided into 12 categories, with the highest category being over
$50,000 (in 1983 $). Here we used the conditional median for national household income
above $50,000. Lastly, geographical information is given by 8 census regions. Average
driving patterns differ significantly across regions of the U.S. The last three waves of data
were collected by the Energy Information Administration in the Residential Transportation
Energy Consumption Survey for the years 1983, 1985, and 1988. Price and income were
collected in the same manner as in the earlier surveys. (The upper limit on income changed
in the surveys; however, the technique used to estimate income in the top category remained
the same.) Note that the (real) gasoline price in the U.S. fell throughout this period, so
that by 1988 it had decreased to about $0.83, a level approximately equal to prices
before the first oil shock in 1974. In the latter surveys, 9, rather than 8, census
divisions were used. Since we are unable to map the earlier 8 region breakdown into 9
regions, or vice versa, in the empirical specifications we use different sets of
indicator variables depending on the survey year.

Overall,  we  have  18,109  observations  which  should  provide  sufficient  observations  to 
do  nonparametric  estimation  and  achieve  fairly  precise  results.     Our  empirical  approach 
is  to  do  both  the  nonparametric  and  parametric  estimation  with  indicator  variables  both 
for  survey  year  and  for  regions.     Thus,  we  have  20  indicator  variables  in  our 
specifications.     In  the  parametric  specifications,   we  allow  for  interactions,  most  of 
which  are  found  to  be  statistically  significant  which  is  to  be  expected  given  our  very 
large  sample.     However,   we  decided  to  use  the  same  set  of  indicator  variables  in  both  the 
nonparametric  and  parametric  specifications  to  make  for  easier  comparisons. 

We  give  four  types  of  nonparametric  estimates,  using  the  Epanechnikov  kernel  from 
equation  (2.3),  the  normal  higher  order  kernel  from  equation  (2.3a),   a  cubic  regression 
spline  with  evenly  spaced  knots,   and  power  series.     We  use  a  log-linear  demand 
specification, where T(g) = e^g. Also, the covariates w are 20 indicator variables
that  allow  for  different  region  and  survey  year  effects.     In  the  results  we  present  we 
evaluate  demand  and  consumer  surplus  at  a  fixed  value  for     w     equal  to  its  sample  mean. 

We tried several different bandwidths for the kernel estimators, based on the
smoothness of the graphs discussed below. The Epanechnikov kernel estimates used a
bandwidth of .82 for the coefficients $\beta$ of $w$, and $\tau(x) = 1$. Three bandwidths were
used to form $\hat r(x)$: h = .82, h = .55, and h = .45. For the normal, higher
order kernel in equation (2.3a), we used bandwidths of h = .55 and h = .45 for both
the nonparametric part and the coefficients of $w$.

For the series estimates, we used cross-validation to help choose the number of
terms. The cross-validation criterion is the sum of squares of prediction errors from
predicting each observation using coefficients estimated from all other observations.
Minimizing a cross-validation criterion is known to lead to minimum asymptotic mean-square
error estimates. Table 5.1 reports the criteria for splines and power series that are additive
in (log) price and income. No interactions were included because they were never found
to be significant, either in a statistical sense or in terms of lowering the
cross-validation criteria.
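For a least squares series estimator, the leave-one-out criterion described above does not require refitting the regression n times: the prediction error for observation i equals the ordinary residual divided by one minus its leverage. The sketch below illustrates this on simulated data; the function names `loo_cv` and `power_basis` and the data-generating process are assumptions made for the example.

```python
import numpy as np

def loo_cv(X, y):
    """Sum of squared leave-one-out prediction errors for OLS of y on X."""
    Q, _ = np.linalg.qr(X)                 # reduced QR; columns of Q span col(X)
    leverage = np.sum(Q**2, axis=1)        # diagonal of the projection matrix
    resid = y - Q @ (Q.T @ y)              # ordinary least squares residuals
    return np.sum((resid / (1.0 - leverage))**2)

def power_basis(x, order):
    # Columns 1, x, x^2, ..., x^order
    return np.vander(x, order + 1, increasing=True)

# Simulated example: choose the polynomial order with the smallest criterion
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 200)
y = np.sin(2.0 * x) + 0.1 * rng.normal(size=200)
cv_by_order = {k: loo_cv(power_basis(x, k), y) for k in range(1, 8)}
best_order = min(cv_by_order, key=cv_by_order.get)
```

The same function applies to a spline basis; only the construction of the regressor matrix changes.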


Table 5.1: Cross-validation for Series Estimators

Cubic spline, evenly spaced knots

    Knots     CV
      1       832
      2       826
      3       842
      4       857
      5       801
      6       797
      7       784
      8       801
      9       782
     10       801

Power series

    Order     CV
      1       922
      2       903
      3       834
      4       824
      5       825
      6       900
      7       823
      8       791
      9       804

The criteria are minimized at 7 or 9 knots and an 8th order power series. The theory
suggests that one should choose a number of knots that is greater than the mean-square
error minimizing one, so the bias goes to zero faster than the variance. For this reason
we prefer 8 or 9 knots and an 8th or 9th order polynomial. The results for 9
knots were similar to those for 8, except that the estimated standard errors were very
large, so we do not report them. Instead we report results for 6, 7, and 8 knots.

For purposes of comparison we also report results for standard parametric forms for
the demand function. The specification of the demand function is:

(5.1)  $\ln q = \eta_0 + \eta_1 \ln p + \eta_2 \ln y + w'\beta + \varepsilon,$


where y is household income, p is gasoline price, and w are the 20 region and time
dummies discussed above. The estimated income elasticity is .37 with standard error
.01, while the estimated price elasticity is -0.80 with standard error .09.
To check on the sensitivity of our estimates, we also estimated a "translog" type of
parametric specification where we allow for quadratic terms in log income and log price
as well as an interaction term between log income and log price. The income-price
interaction term has no estimated effect, but the quadratic terms do have an effect on
the estimates. The sum of squared errors decreases from 8900.1 to 8877.0, which is a
decrease of only 0.26%, but because of the large sample size a traditional F statistic
with three degrees of freedom is calculated to be 16.1. The estimated elasticities for the
log-linear and log-quadratic models are quite similar at the median gasoline price,
$1.23: the log-linear price elasticity is -0.81 at all income levels, while the
log-quadratic model price elasticity is approximately -0.87 with very little variation
across income levels. The two specifications do have different elasticities at lower and
higher gasoline prices. The log-linear model price elasticity, since it is estimated as
a single parameter, remains at -0.81 across all gasoline prices, while the log-quadratic
model, which has a variable price elasticity, has an estimated elasticity of
approximately -0.64 at a gasoline price of $1.08 (the first quartile price) and
an estimated elasticity of -1.14 at a gasoline price of $1.43
(the third quartile). Since the results for the log-quadratic translog specification are
only approximately similar to those for the simpler linear specification, we present results
for both of the parametric specifications in what follows.
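As a worked illustration of why the log-quadratic elasticity varies with price, consider a generic log-quadratic form. The coefficient names $\eta_{11}$, $\eta_{22}$, and $\eta_{12}$ below are hypothetical labels introduced only for this illustration and do not appear in the text.

```latex
\ln q = \eta_0 + \eta_1 \ln p + \eta_2 \ln y
      + \eta_{11} (\ln p)^2 + \eta_{22} (\ln y)^2 + \eta_{12} \ln p \,\ln y
      + w'\beta + \varepsilon ,
\qquad
\frac{\partial \ln q}{\partial \ln p} = \eta_1 + 2\eta_{11} \ln p + \eta_{12} \ln y .
```

With a negligible interaction term, the elasticity is linear in log price. The elasticities quoted above are roughly consistent with this: between the first quartile and median prices the implied slope is (-0.87 + 0.64)/ln(1.23/1.08), about -1.8, and between the median and third quartile prices it is (-1.14 + 0.87)/ln(1.43/1.23), again about -1.8.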

Figures 1-3 show the estimated log demand with respect to log price,
evaluated at mean income, for the parametric, Epanechnikov kernel, and spline
specifications. We do not give graphs for other income values because the graphs have
similar shape and the spline results discussed below strongly suggest that the log demand
function is additive in log price and log income, in which case the shape of log demand
will not depend on income. There are interesting differences between the parametric and
nonparametric estimates, with the nonparametric estimates having a much more complicated
shape than the parametric ones. In Figure 2 we find kernel demand curves that are
generally downward sloping over the range of the data. In our opinion the demand curve
for h = .82 looks "too smooth," and the one for h = .45 looks "too rough," so in the
subsequent analysis we will use h = .55 as our preferred bandwidth. The shape of the
spline demand function is qualitatively similar to that of the kernel estimate, although the
demand function does slope up very slightly at some points on the graph. We tested
whether this upward slope indicates a failure of demand theory using the test of
downward sloping compensated demand described above. For 8 knots (our preferred number)
and a price change from $1.39 to $1.46, which is the range over which the demand curve slopes
up, our N(0,1) statistic is .90. This value is not statistically positive at any
conventional (one-sided) critical value.

We  next  estimate  the  exact  consumers  surplus,  here  the  equivalent  variation,  across 
our  different  estimated  demand  curves.     We  consider  two  sets  of  price  changes  for 
gasoline:     an  increase  from  $1.00  to  $1.30  (in  1983  $)  per  gallon  and  an  increase  from 
$1.00 to $1.50 per gallon. The starting price of $1.00 corresponds roughly to 1992
gasoline prices, and a 50 cent increase is well within the range of the data. We
estimate the equivalent variation for these price changes at the median of income. We
present  the  estimates  in  Table  5.2.     Estimated  standard  errors  are  given  in  parentheses, 
and  were  calculated  using  the  formulas  given  earlier. 
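The welfare calculations behind these estimates can be sketched for any demand function that returns quantity as a function of price and income. The code below is an illustration under stated assumptions, not the procedure used for the tables: a single good, a linear price path, Heun's method for the differential equation, and hypothetical function names. Deadweight loss is computed as the equivalent variation minus the tax receipts term, following the form of the deadweight loss case in (3.7).

```python
import numpy as np

def equivalent_variation(demand, p0, p1, y0, steps=400):
    # Solve dS/dt = -q(p(t), y0 - S(t)) * p'(t), S(1) = 0, backward from t = 1
    # to t = 0 along p(t) = p0 + t (p1 - p0); the equivalent variation is S(0).
    dp = p1 - p0
    h = 1.0 / steps
    S, t = 0.0, 1.0
    for _ in range(steps):
        k1 = -demand(p0 + t * dp, y0 - S) * dp
        k2 = -demand(p0 + (t - h) * dp, y0 - (S - h * k1)) * dp
        S -= 0.5 * h * (k1 + k2)   # Heun (trapezoidal) step toward t - h
        t -= h
    return S

def deadweight_loss(demand, p0, p1, y0):
    # Equivalent variation minus tax receipts at the new price and income y0
    ev = equivalent_variation(demand, p0, p1, y0)
    return ev - (p1 - p0) * demand(p1, y0)

# Hypothetical constant-elasticity demand with no income effect
def demo_demand(p, y):
    return 100.0 * p ** (-0.8)

ev_demo = equivalent_variation(demo_demand, 1.0, 1.5, 20000.0)
dwl_demo = deadweight_loss(demo_demand, 1.0, 1.5, 20000.0)
```

In the no-income-effect case used for the demonstration, the equivalent variation is just the area to the left of the demand curve, which gives a closed form against which the numerical solution can be checked.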


The conventional significance levels may not be appropriate here, because we have chosen
the interval based on the estimated demand function. However, the conventional
critical values should provide a bound when the test statistic is maximized over
choice of interval, with our test being an approximate maximum.

Table 5.2: Yearly Equivalent Variation Estimates
(estimated standard errors in parentheses)

                                          $1.00-1.30         $1.00-1.50

Parametric Estimates
   1.  Log linear model                 282.34  (2.07)     442.00  (2.31)
   2.  Translog quadratic model         285.44  (3.03)     444.16  (3.90)

Epanechnikov Kernel Estimates
   3.  h = .45                          281.31  (4.36)     445.09  (4.46)
   4.  h = .55                          284.00  (4.11)     447.41  (4.11)
   5.  h = .82                          279.87  (3.58)     445.42  (3.59)

Normal, Higher Order Kernel Estimates
   6.  h = .45                          286.44  (6.56)     450.14  (7.07)
   7.  h = .55                          284.36  (5.35)     445.35  (6.34)

Cubic Spline Estimates
   8.  6 knots                          284.58  (4.93)     444.63  (6.31)
   9.  7 knots                          282.30  (4.68)     441.72  (5.82)
  10.  8 knots                          287.12  (4.77)     447.80  (6.04)

Power Series Estimates
  11.  8th order                        287.31  (4.76)     448.64  (5.91)
  12.  9th order                        287.27  (4.75)     448.55  (5.88)

Note that all the nonparametric estimates are quite close. Different choices of
nonparametric estimator lead to virtually the same welfare estimates.
Surprisingly, although the graphs show quite different shapes for parametric and
nonparametric estimates, the welfare estimates are quite similar. A Hausman test
statistic based on the difference between the kernel and parametric estimates and the
difference of their estimated variances, which is valid if the disturbance is
homoskedastic, is 1.3, which is not significant for a two-tailed test based on the normal
distribution.

We also estimate the deadweight loss from a rise in the gasoline tax of either $0.30
or $0.50, which would induce the corresponding rise in gasoline prices. We base our
estimate of deadweight loss on the equivalent variation measure of the compensating
variation. The results are given in Table 5.3:


Table 5.3: Yearly Average Deadweight Loss Estimates
(estimated standard errors in parentheses)

                                          $1.00-1.30         $1.00-1.50

Parametric Estimates
   1.  Log linear model                  26.92  (3.13)      62.50  (6.77)
   2.  Translog quadratic model          27.07  (3.15)      75.30  (7.24)

Epanechnikov Kernel Estimates
   3.  h = .45                           27.70  (5.65)      33.41  (8.68)
   4.  h = .55                           27.94  (5.01)      36.90  (7.97)
   5.  h = .82                           22.89  (3.88)      37.10  (6.80)

Normal, Higher Order Kernel Estimates
   6.  h = .45                           36.38  (6.47)      33.09 (14.06)
   7.  h = .55                           31.62  (5.72)      47.61  (8.92)

Cubic Spline Estimates
   8.  6 knots                           28.60  (5.03)      51.05  (8.65)
   9.  7 knots                           38.68  (5.37)      46.84  (8.77)
  10.  8 knots                           35.72  (5.27)      46.95  (8.94)

Power Series Estimates
  11.  8th order                         34.92  (5.21)      48.66  (8.80)
  12.  9th order                         34.86  (5.38)      48.74  (8.86)

The estimates of DWL are very similar across bandwidths for the kernel estimator, but are
sensitive to the use of a higher order kernel. We give more credence to the higher order
kernel estimates, because of their theoretically better property of having smaller bias
relative to mean-square error, and their similarity to the spline estimates. The spline
results are sensitive to the number of knots. We prefer the results for the larger
number of knots, because they are theoretically preferred and the results seem to be less
sensitive to the choice between 7 and 8 than to that between 6 and 7.

We do find rather large differences between the nonparametric and parametric
estimates of deadweight loss. The estimated differences between the nonparametric
estimates and the log linear parametric estimates are in the range of 40-50%. In
particular, the nonparametric estimates seem to be somewhat smaller for the larger price
change and larger for the smaller price change. These are economically significant
differences, with the ratio of DWL to tax revenue varying widely between parametric and
nonparametric specifications. The differences are also statistically significant: a
Hausman test statistic based on the log-linear and normal kernel specifications is -10.14 for
the larger price change and bandwidth .55 and is 5.18 for the smaller price change
and bandwidth .45. Thus, efficiency decisions on which commodities to tax and the size
of the tax might well depend on the rather sizable differences in the estimated
efficiency cost changes from the increased taxes.
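The form of a Hausman-type comparison between a parametric and a nonparametric estimate can be sketched as follows. This is a generic illustration with made-up inputs: it assumes the parametric estimator is efficient under the null, so that the variance of the difference is the difference of the variances, and the function name `hausman_z` is hypothetical. It is not claimed to reproduce the statistics reported above, whose exact variance calculation is not spelled out here.

```python
import math

def hausman_z(theta_eff, se_eff, theta_rob, se_rob):
    """Standardized difference between a robust and an efficient estimate.

    Under the null (and efficiency of the first estimator), the variance of
    the difference is se_rob^2 - se_eff^2, and the ratio is roughly N(0, 1).
    """
    var_diff = se_rob**2 - se_eff**2
    if var_diff <= 0.0:
        raise ValueError("variance difference is not positive in this sample")
    return (theta_rob - theta_eff) / math.sqrt(var_diff)

# Made-up numbers for illustration: difference 3, variance difference 25 - 9 = 16
z_demo = hausman_z(10.0, 3.0, 13.0, 5.0)
```

In finite samples the estimated variance difference can be negative, in which case this simple form of the statistic is not available; the sketch raises an error rather than returning a number in that case.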


6.        Asymptotic Distribution Theory

In  this  section  we  give  results  for  the  kernel  estimator  of  consumer  surplus  at 
particular  income  and  covariate  values,  and  for  power  series.     The  conclusions  for  other 
cases  will  be  described,   without  full  sets  of  regularity  conditions.     These  results  show 
that  the  standard  error  formulas  given  earlier  are  correct,  and  describe  the  asymptotic 
properties  of  the  estimators,   without  overburdening  the  reader. 

Precise conditions are useful for stating the results. Let $p_t(t,u) = \partial p(t,u)/\partial t$
and $g(p,y,w)$ denote $r(x(p,y)) + w'\beta$. Let $\mathcal{U}$ be the support of $u$, $\mathcal{P} = p([0,1]\times\mathcal{U})$ be
the image of $p(t,u)$ on $[0,1]\times\mathcal{U}$, and $\mathcal{W}$ be the support of $w$. Also, let $\mathcal{Y} = [\underline{y},\bar y]$
be a set of $y$ values that includes those where the demand function is evaluated in
solving equation (1.1). The conditions will involve supremum norms for the demand
function over the set $\mathcal{Z} = \mathcal{P}\times\mathcal{Y}\times\mathcal{W}$. Let
$\|g\|_j = \sup_{z\in\mathcal{Z},\,\ell\le j}\|\partial^\ell g(p,y,w)/\partial(p,y,w)^\ell\|$, where
for a matrix function $B(z)$, $\partial^\ell B(z)/\partial z^\ell$ denotes any vector of all distinct $\ell$th order
partial derivatives of all elements of $B(z)$. Let "a.e." denote almost everywhere with
respect to Lebesgue measure.

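Assumption 1: $\mathcal{W}$, $\mathcal{U}$, and $\mathcal{Y}$ are compact, $\mathcal{Z}$ is contained in the support of $z_i$,
$\|g_0\|_j < \infty$ for the required order $j$, $T(g)$ and $x(p,y)$ are one-to-one and three times
continuously differentiable with nonsingular Jacobians on their respective domains, $p(t,u)$
is twice continuously differentiable on $[0,1]\times\mathcal{U}$, and $\mathcal{P}$ does not include zero. Also,
$\omega(y)$ is bounded and continuous a.e. and for
$C = \sup_{t\in[0,1],\,u\in\mathcal{U},\,y\in\mathcal{Y},\,w\in\mathcal{W}}|T(g_0(p(t,u),y,w))'p_t(t,u)|$, $\omega(y) = 0$ for $y$
outside $\mathcal{Y}_\varepsilon = [\underline{y}+C+\varepsilon,\ \bar y-C-\varepsilon]$, for some $\varepsilon > 0$.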

We will derive the results under an i.i.d. assumption and certain moment conditions
specified in the following result. Let $\tau(x)$ be a trimming function that will be
identically equal to one for series estimators.

Assumption 2: $z_i$ is i.i.d., $E[\|T^{-1}(q)\|^4] < \infty$, $x$ is continuously distributed with
bounded density $f_0(x)$, and $E[\tau(x)(w - E[w|x])(w - E[w|x])']$ is nonsingular.

Assumptions  1  and  2  are  useful  for  both  the  kernel  and  series  results.     Following  the 
earlier  format,  we  will  first  give  results  for  kernel  estimators.     The  next  three 
assumptions  are  more  or  less  standard  conditions  for  kernel  estimators. 

Assumption 3: $E[\|T^{-1}(q)\|^4 \mid x]f_0(x)$ is bounded, $\tau(x)$ is bounded, bounded away from zero
on $z \in \mathcal{Z}$, and zero outside a compact set on which $f_0(x)$ is positive.

Assumption 4: There is a positive integer $s$ such that $K(u)$ is twice continuously
differentiable, with Lipschitz derivatives, $K(u)$ is zero outside a bounded set, $\int K(u)du
= 1$, and for all $0 < j < s$, $\int K(u)[\otimes_{\ell=1}^{j} u]du = 0$.

Assumption 5: There is a nonnegative integer $d$ and extensions of $E[T^{-1}(q)|x]$ and
$E[w|x]$ to all of $R^{k+1}$ such that $f_0(x)$, $f_0(x)r_0(x)$, and $f_0(x)E[w|x]$ are continuously
differentiable to order $d \ge s + 2$ on $R^{k+1}$.

Assumption  4  requires  that  the  kernel  be  a  higher  order,  bias-reducing  type.     An  example 
of  such  a  kernel  is  the  Gaussian  one  used  in  the  application. 

To describe the result for the kernel estimator of consumer surplus at a particular
income and covariate value, it is necessary to introduce a little more notation. Let
$S(t)$ denote the solution to equation (1.1) at the truth, i.e. the solution to $dS/dt =
-T(g_0(p(t),y_0-S(t),w_0))'p_t(t)$, $S(1) = 0$, where $p_t(t) = \partial p(t)/\partial t$. Let $x(t) =
x(p(t),y_0-S(t))$, and partition $x(t) = (x_1(t),x_2(t)')'$, where $x_1(t)$ is a scalar.
Let

$\zeta(t) = \xi(t)\cdot\exp\{-\int_0^t \xi(v)'[\partial g_0(p(v),y_0-S(v),w_0)/\partial y]dv\},$

$\xi(t) = T_g(g_0(p(t),y_0-S(t),w_0))'p_t(t).$

Let $t(\tau)$ be the inverse function of $x_1(t)$ (which will exist by Assumption 6 below),
$\tilde x(\tau) = x(t(\tau)) = (\tau,\tilde x_2(\tau)')'$, and $\tilde\zeta(\tau) = \zeta(t(\tau))$.

Assumption 6: i) There is $\eta > 0$ such that $p(t)$ can be extended to a function with
domain $D = [-\eta,1+\eta]$ such that $x_1(t)$ is one-to-one with derivative bounded away from
zero; ii) $E[\varepsilon\varepsilon'|x]$ is continuous a.e. and for some $\eta > 0$, and $(0,\gamma')'$ partitioned
conformably with $\tilde x(\tau) = (\tau,\tilde x_2(\tau)')'$,
$\int \sup_{\|\gamma\|\le\eta}(1 + E[\|\varepsilon\|^4 \mid x = \tilde x(\tau)+(0,\gamma')'])f_0(\tilde x(\tau)+(0,\gamma')')d\tau < \infty$; iii) $f_0$ is bounded away
from zero on $x(p(D),[\underline{y},\bar y])$, for $\underline{y}$ and $\bar y$ from Assumption 1.


Note that $x_2(\tau)$ is differentiable by the inverse function theorem and the chain rule, and let $K(\tau) = \int[\int K(v, u + [\partial x_2(\tau)/\partial\tau]v)\,dv]^2\,du$.  The asymptotic variance of consumer surplus at a point will be

$V_0 = \int 1(0 \le t(\tau) \le 1)\,K(\tau)\,f_0(x(\tau))^{-1}\,|\partial t(\tau)/\partial\tau|^2\,E[\{\zeta(\tau)'\varepsilon\}^2 \mid x = x(\tau)]\,d\tau$.


Let     1(A)     denote  the  indicator  function  for  the  set     A.     The  following  result  gives  the 
asymptotic  distribution  of  the  kernel  estimator  of  consumer  surplus  evaluated  at  a 
particular  income  and  covariate  value. 


Theorem  1:     If  Assumptions  1-6  are  satisfied,  for     w(y)  =  l(y  =  y^),     c  =  (r(n) 
with     na-^^^^/llnCn)]^  -^  oo,     and     n<r^^  -^  0,     then     Vnc^^^CS-S^)  -S  N(0.  V^).     If  in 
addition     n(r         /ln(n)  — >  m,     then  for     V     in  equation  (3.5),     a-  V  — ^  K_. 


For  our  application,  where     k  =  1,     the  conditions  on  the  bandwidth     <r     are  that 

ncr    /ln(n)  — >  m     and     no-      — >  0.     These  conditions  require  that     s  >  5,     i.e.  that  the 

kernel be at least sixth order.  The normal kernel used in the application is such a sixth-order kernel.

The asymptotic distribution of a kernel estimator of deadweight loss at a particular income and covariate value is straightforward to derive.  Because the "tax receipts" term $(p^1-p^0)'T(g(p^1,y_0,w_0))$ depends on the demand function evaluated at a particular point, it will have a slower convergence rate than the consumer surplus "integral," and hence will dominate the asymptotic distribution.  Then standard results on pointwise convergence of kernel regression estimators can be applied to obtain, for $x^1 = x(p^1,y_0)$,

$\sqrt{n}\,\sigma^{(k+1)/2}(\hat L - L_0) \xrightarrow{d} N(0, V_0)$,

$V_0 = [\int K(u)^2\,du]\,f_0(x^1)^{-1}\,E[\{(p^1-p^0)'T_g(g_0(x^1,w_0))\varepsilon\}^2 \mid x = x^1]$.

Also, it is straightforward to show consistency of the asymptotic variance estimator, by means like those used to prove Theorem 1.

As previously noted, average consumer surplus and deadweight loss will be $\sqrt{n}$-consistent if initial and final prices are allowed to vary.  In this case the asymptotic distribution will be the same for both kernel and series estimators.  This asymptotic distribution will be described below.

Some  of  the  conditions  need  to  be  modified  for  series  estimators. 


Assumption 7:  $E[\varepsilon_i\varepsilon_i' \mid x_i,w_i]$ is bounded and has smallest eigenvalue that is bounded away from zero.


Let $\|g\|_j$ be as defined above except that the supremum is taken over the support of $(x,w)$, and let $X$ denote the support of $x$.

Assumption 8:  $\varphi^K(x)$ consists of products of powers of the elements of $x$ that are nondecreasing in order as $K$ increases, with all terms of a given order included before the order is increased, $X$ is a compact rectangle and the density of $x$ is bounded away from zero on $X$.

The condition that the density is bounded away from zero is useful for controlling the variance of a series estimator.  The next condition is useful for controlling the bias.
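The ordering of power series terms required by Assumption 8 can be made concrete with a short sketch. The code below lists exponent vectors by total degree, so that every term of a given order appears before any term of the next order. The tie-breaking rule within an order (reverse lexicographic) and the function names are our own illustrative choices; Assumption 8 does not fix them.

```python
from itertools import product

def power_series_exponents(dim, K):
    """Return the first K exponent vectors, grouped by total degree.

    All exponent vectors of degree d are listed before any of degree d + 1.
    Within a degree, the order is reverse lexicographic (an arbitrary choice).
    """
    out = []
    degree = 0
    while len(out) < K:
        block = [e for e in product(range(degree + 1), repeat=dim)
                 if sum(e) == degree]
        block.sort(reverse=True)
        out.extend(block)
        degree += 1
    return out[:K]

def power_series_terms(x, K):
    """Evaluate the first K power series terms at the point x."""
    terms = []
    for exps in power_series_exponents(len(x), K):
        term = 1.0
        for xi, ei in zip(x, exps):
            term *= xi ** ei
        terms.append(term)
    return terms

if __name__ == "__main__":
    print(power_series_exponents(2, 6))
    print(power_series_terms((2.0, 3.0), 6))
```

With two regressors and $K = 6$ this gives the constant, the two linear terms, and the three quadratic terms, in that order.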


Assumption 9:  $r_0(x)$ and $E[w|x]$ are continuously differentiable of all orders on $X$ and there is a constant $C$ such that for all integers $d$ the partial derivatives of order $d$ are bounded in absolute value by $C^d$ on $X$.


This smoothness condition is undoubtedly stronger than necessary.  It is used in order to apply second order Sobolev norm approximation rates (i.e. approximation of the function and derivatives up to order 2), where a literature search has not yet revealed such approximation rates for power series except under this hypothesis.  Also, results for regression splines are not given here because multivariate Sobolev approximation rates do not seem to be readily available for them.

Average equivalent variation and deadweight loss will be $\sqrt{n}$-consistent under certain conditions that we now describe.  Let $t_i$ denote a random variable that is uniformly distributed on $[0,1]$ and independent of $z_i$, $\tilde p_i = p(t_i,u_i)$, $\tilde y_i = y_i - S(t_i,z_i,g_0)$, and $\tilde x_i = x(\tilde p_i,\tilde y_i)$.  Let $S(t,z) = S(t,z,g_0)$ and

$\zeta(t,z) = \tilde\zeta(t,z)\cdot\exp\{-\int_0^t \tilde\zeta(v,z)'g_{0y}(p(v,u),y-S(v,z),w)\,dv\}$,

$\tilde\zeta(t,z) = T_g(g_0(p(t,u),y-S(t,z),w))'p_t(t,u)$.


Assumption  10:     Conditional  on     w,     x     is  continuously  distributed  with  bounded  density 
f  (x|w)     and  for  the  density     f(x|w)     of     x.     given     w,     a(x,w)  = 
E[tj(y.)^(t.,z.)|x.=x,w.=w]     is  zero  outside  the  support  of     f(x|w),     and 
f(x|w)    a(x,w)     is  bounded. 


Let     w  =  E[a)(y.)]     and 


C'^(x,w)  =  f(x|w)  ^f'^(x|w)E[tj(y.)C(t.,z.)|x.=x,w.=w]. 
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C^(x,w)  =  E[C^(x,w)|x]  +  (w-E[w|x])'M  ^E[(w-E[w|x])' C^Cx.w)]. 
\l/j  =  u    u(y.)S(0,z.,g   )  -^1+0)    ^l  l(j}(.y.)  -  w]  +  w"  c'^(x.,w.)'c.. 

Then  the  asymptotic  variance  of     ^i     will  be     E[\Jf:*lr.'  ]. 

Under an additional condition we can also derive the asymptotic variance of deadweight loss.  We consider here only the case where there is sufficient variation in the final price to achieve $\sqrt{n}$-consistency.  The next assumption embodies this requirement, by the condition that the final price is continuously distributed.

Assumption  11:     Conditional  on     w,     x.  =  x(p  {u.),y.)     is  continuously  distributed  with 
bounded  density     f   (x|w),     and     cj(y)f   (x|w)     is  zero  outside  the  support  of     f(x|w). 


Let     T.   =  cj  ^u(y.)(p^-p°)'T(g(p^(u.),y.)))     and 

cV.w)  =  f(x|wr^f'^(x|w)E[cj(y.)(p^-p°)'T  (g-(x..w.))  |  x.=x,w.=w]: 

1  g     0     1      1         1  1 

C^x.w)  =  E[C^x,w)|x]  +  (w-E[w|x])'m"^E[(w-E[w|x])'c'^(x.w)]. 

/  =  i//*^  -  {T.  -  E[T.]  -  E[T.]w"^[(j(y.)  -  w]  +  l^^^\x.,^N.y  c). 
11  1  1  1  1  111 

L  2 
Then  the  asymptotic  variance  of     L     will  be     E[(i/».)   ]. 

The  next  result  shows  asymptotic  normality  of  the  power  series  estimators. 

Theorem 2:  Suppose that Assumptions 1 - 2, 7 - 9 are satisfied, with $\tau(x) = 1$, and $K = K(n)$ satisfies $K^{11}/n \to 0$ and $Kn^{-\gamma} \to \infty$ for some $\gamma > 0$.  Then for $\theta = S$ or $\theta = L$, equation (3.3) will be satisfied and $\hat\theta = \theta_0 + O_p(K/\sqrt{n})$.  If Assumption 10 is also satisfied and $V^S = E[(\psi_i^S)^2] > 0$, then $\sqrt{n}(\hat\mu - \mu_0) \xrightarrow{d} N(0,V^S)$ and $\hat V^S \xrightarrow{p} V^S$.  If Assumption 11 is also satisfied and $V^L = E[(\psi_i^L)^2] > 0$, then $\sqrt{n}(\hat\lambda - \lambda_0) \xrightarrow{d} N(0,V^L)$ and $\hat V^L \xrightarrow{p} V^L$.

Appendix:  Proofs of Theorems

Throughout  the  Appendix     C     will  denote  a  generic  positive  constant,   that  may  be 
different  in  different  uses. 

One intermediate result that is needed for both kernel and series estimators is a linearization of the solution to equation (1.1) around the true demand function.  Some additional notation is needed to set up this linearization.  Let $z$ be a data observation that includes $u$, and let $S(t,z,g)$ denote the corresponding solution to equation (1.1), i.e. the solution to $\partial S/\partial t = -T(g(p(t,u),y-S,w))'p_t(t,u)$, $S(1,z,g) = 0$.  Let $g_y(p,y,w)$ denote the derivative of $g(p,y,w)$ with respect to $y$, and

(6.1)  $\zeta(t,z,g) = \tilde\zeta(t,z,g)\cdot\exp\{-\int_0^t \tilde\zeta(v,z,g)'g_y(p(v,u),y-S(v,z,g),w)\,dv\}$,

$\tilde\zeta(t,z,g) = T_g(g(p(t,u),y-S(t,z,g),w))'p_t(t,u)$.


Under conditions specified below, when $g$ is near the truth $g_0$, equivalent variation can be approximated by the linear functional

(6.2)  $A(z,\bar g;g) = \int_0^1 \bar g(p(t,u),y-S(t,z,g),w)'\zeta(t,z,g)\,dt$,

evaluated at $g = g_0$.
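The quality of this linear approximation can be illustrated numerically. The sketch below makes several simplifying assumptions of our own: a single price on a linear path, $T$ equal to the identity (so that $\tilde\zeta = p_t$), and an illustrative demand function perturbed proportionally by a factor $1 + \delta$. It compares the change in $S(0)$ with the linear functional applied to the perturbation; if the approximation error is second order, halving $\delta$ should divide the error by roughly four. All function names are hypothetical.

```python
import math

def solve_path(demand, p0, p1, y, n=2000):
    """Return S(t_i), t_i = i/n, for dS/dt = -demand(p(t), y - S) p'(t), S(1) = 0."""
    dp = p1 - p0

    def rhs(t, s):
        return -demand(p0 + t * dp, y - s) * dp

    h = -1.0 / n                      # step backward from t = 1
    S = [0.0] * (n + 1)
    for i in range(n, 0, -1):
        t, s = i / n, S[i]
        k1 = rhs(t, s)
        k2 = rhs(t + h / 2, s + h * k1 / 2)
        k3 = rhs(t + h / 2, s + h * k2 / 2)
        k4 = rhs(t + h, s + h * k3)
        S[i - 1] = s + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return S

def linear_functional(direction, demand_y, S, p0, p1, y):
    """Trapezoid approximation of int_0^1 direction(p(t), y - S(t)) zeta(t) dt,
    where zeta(t) = p'(t) exp(-int_0^t demand_y(p(v), y - S(v)) p'(v) dv)."""
    n = len(S) - 1
    dp = p1 - p0
    total, cum = 0.0, 0.0
    prev_a = prev_f = 0.0
    for i in range(n + 1):
        p, inc = p0 + (i / n) * dp, y - S[i]
        a = demand_y(p, inc) * dp               # integrand inside the exponential
        if i > 0:
            cum += 0.5 * (a + prev_a) / n       # running integral from 0 to t_i
        f = direction(p, inc) * dp * math.exp(-cum)
        if i > 0:
            total += 0.5 * (f + prev_f) / n
        prev_a, prev_f = a, f
    return total

def g(p, y):
    """Illustrative demand function (an assumption, not the paper's)."""
    return p ** -0.8 * y ** 0.5

def g_y(p, y):
    """Derivative of g with respect to income."""
    return 0.5 * p ** -0.8 * y ** -0.5

def linearization_error(delta, p0=1.0, p1=1.5, y=10.0):
    """S(0; (1+delta) g) - S(0; g) - A(delta * g; g)."""
    S = solve_path(g, p0, p1, y)
    S_tilde = solve_path(lambda p, inc: (1.0 + delta) * g(p, inc), p0, p1, y)
    A = linear_functional(lambda p, inc: delta * g(p, inc), g_y, S, p0, p1, y)
    return S_tilde[0] - S[0] - A

if __name__ == "__main__":
    for delta in (0.1, 0.05, 0.025):
        print(delta, linearization_error(delta))
```

The printed errors are small relative to the perturbation and shrink at roughly a quadratic rate in $\delta$, which is the behavior the product bound in Lemma A1 describes.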

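Lemma A1:  If Assumption 1 is satisfied then there is an $\varepsilon > 0$ and a constant $C$ such that for all $z \in Z_\varepsilon$, $\|g-g_0\|_2 < \varepsilon$, and $\|\tilde g-g_0\|_2 < \varepsilon$, it is the case that
$|\omega(y)S(0,z,\tilde g)-\omega(y)S(0,z,g)-\omega(y)A(z,\tilde g-g;g)| \le C\|\tilde g-g\|_0\|\tilde g-g\|_1$,  $|\omega(y)A(z,\bar g;g_0)| \le C\|\bar g\|_0$,
$|\omega(y)S(0,z,g)-\omega(y)S(0,z,g_0)| \le C\|g-g_0\|_0$,  and for any $\bar g$ with $\|\bar g\|_1 < \infty$,
$|\omega(y)A(z,\bar g;g)-\omega(y)A(z,\bar g;g_0)| \le C(\|\bar g\|_1\|g-g_0\|_0 + \|\bar g\|_0\|g-g_0\|_1)$.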

Proof of Lemma A1:  By Assumption 1, it suffices to prove the result when $\omega(y) = 1$ for all $y \in \mathcal{Y}_\varepsilon$.  We use standard results on existence and continuity of solutions of differential equations, e.g. in Finney and Ostberg (1976).  Let $q(t,y,z,g) = T(g(p(t,u),y,w))'p_t(t,u)$.  By $T(g)$ thrice continuously differentiable, its derivatives up to order 2 are bounded and Lipschitz on any bounded set.  Then by $p_t(t,u)$ bounded, we can choose $\varepsilon$ so that for $\|g-g_0\|_2 < \varepsilon$, $\|\tilde g-g_0\|_2 < \varepsilon$, and $\varepsilon$ small enough,

(A.1)  $\sup_{t\in[0,1],\,y\in\mathcal{Y},\,z\in Z} |\partial^j q(t,y,z,\tilde g)/\partial y^j - \partial^j q(t,y,z,g)/\partial y^j| \le C\|\tilde g-g\|_j$,  ($j$ = 0, 1, 2).

In particular, for $\varepsilon$ small enough, $\sup_{t\in[0,1],\,y\in\mathcal{Y},\,z\in Z} |q(t,y,z,g)| \le C+\varepsilon$.  Also, by construction of $\mathcal{Y}_\varepsilon$, $(p(t,u),y-s) \in \mathcal{P}\times\mathcal{Y}$ for $(t,u,y,s) \in [0,1]\times\mathcal{U}\times\mathcal{Y}_\varepsilon\times[-C,C]$.  Then by Theorem 12-6 of Finney and Ostberg (1976), there exists a solution $S(t,z,g)$ to $\partial S(t,z,g)/\partial t = -q(t,y-S(t,z,g),z,g)$, $S(1,z,g) = 0$, for $t \in [0,1]$, $z \in Z_\varepsilon$, and $y$ now included in $z$.  Furthermore, by integration of equation (1.1) on $t \in [0,1]$, $|S(t,z,g)| \le C+\varepsilon$, so that $y-S(t,z,g) \in \mathcal{Y}$ for $z \in Z_\varepsilon$.  Also, the same existence and boundedness properties hold for $\tilde g$ replacing $g$ in $q$ and $S$.

Next, $\|g\|_2 < \infty$ by $\|g_0\|_2 < \infty$ and $\|g-g_0\|_2 < \varepsilon$.  Also, by $p(t,u) \in \mathcal{P}$ and $y-S(t,z,g) \in \mathcal{Y}$ for $t \in [0,1]$ and $z \in Z_\varepsilon$, it follows that $g(p(t,u),y-S(t,z,g),w)$ and $g_y(p(t,u),y-S(t,z,g),w)$ are bounded on this set.  Then boundedness of $\zeta(t,z,g)$ follows by $T_g(g)$ bounded on any compact set and boundedness of $p_t(t,u)$, giving the second conclusion.

Next, it follows by Theorem 12-9 (equation 12-22) of Finney and Ostberg (1976), $T(g)$ Lipschitz on any bounded set, $p_t(t,u)$ bounded, and eq. (A.1) that

(A.2)  $\sup_{t\in[0,1],\,z\in Z_\varepsilon} |S(t,z,\tilde g)-S(t,z,g)| \le C\|\tilde g-g\|_0$,

which implies the third conclusion.

Next, for all $t \in [0,1]$, $z \in Z_\varepsilon$, by $|S(t,z,g)| \le C+\varepsilon$ and $|S(t,z,g_0)| \le C$, it follows that $\|\bar g(p(t,u),y-S(t,z,g),w)\| \le \|\bar g\|_0$, and by a mean-value expansion and eq. (A.2) that $\|\bar g(p(t,u),y-S(t,z,g),w) - \bar g(p(t,u),y-S(t,z,g_0),w)\| \le \|\bar g_y(p(t,u),y-\bar S(t,z,g,g_0),w)\|\,|S(t,z,g)-S(t,z,g_0)| \le C\|\bar g\|_1\|g-g_0\|_0$ for an intermediate value $\bar S$.  Then by boundedness of $\zeta(t,z,g_0)$,

(A.3)  $|A(z,\bar g;g)-A(z,\bar g;g_0)| \le \|\bar g\|_0 \sup_{t\in[0,1],\,z\in Z_\varepsilon} \|\zeta(t,z,g)-\zeta(t,z,g_0)\| + C\|\bar g\|_1\|g-g_0\|_0$.

Also, $g(p(t,u),y-S(t,z,g),w)$, $g_y(p(t,u),y-S(t,z,g),w)$, and $g_{yy}(p(t,u),y-S(t,z,g),w)$ are all bounded uniformly in $t \in [0,1]$, $z \in Z_\varepsilon$, $\|g-g_0\|_2 < \varepsilon$ and $\|\tilde g-g_0\|_2 < \varepsilon$, as are the same expressions with $S(t,z,g)$ replaced by a value in between $S(t,z,g)$ and $S(t,z,g_0)$.  Also, it then follows by mean-value expansion arguments like those above, including expansions in $S(t,z,g)$ around $S(t,z,g_0)$, that uniformly in $\|g-g_0\|_2 < \varepsilon$,

(A.4)  $\sup_{t\in[0,1],\,z\in Z_\varepsilon} \|\zeta(t,z,g)-\zeta(t,z,g_0)\| \le C\|g-g_0\|_1$.

For example, for a value $\bar S(t,z,g,g_0)$ in between $S(t,z,g)$ and $S(t,z,g_0)$,
$\|g_y(p(t,u),y-S(t,z,g),w) - g_{0y}(p(t,u),y-S(t,z,g_0),w)\| \le \|g_y(p(t,u),y-S(t,z,g),w) - g_{0y}(p(t,u),y-S(t,z,g),w)\| + \|g_{0y}(p(t,u),y-S(t,z,g),w) - g_{0y}(p(t,u),y-S(t,z,g_0),w)\| \le \|g-g_0\|_1 + \|g_{0yy}(p(t,u),y-\bar S(t,z,g,g_0),w)\|\cdot|S(t,z,g)-S(t,z,g_0)| \le C\|g-g_0\|_1$.  The fourth conclusion then follows by eq. (A.3).

Finally, to show the first conclusion, let $D(t,z,\tilde g,g) = S(t,z,\tilde g)-S(t,z,g)$.  For notational convenience, suppress the $t$ and $z$ arguments, and let $\tilde S = S(t,z,\tilde g)$ and $S = S(t,z,g)$.  Differencing the differential equation gives

(A.5)  SD/at  =  -q(y-S,i)   +  q(y-S,g)  =  -[q(y-S,i)   -  q(y-S,g)]  -  [q(y-S,g)  -  q(y-S,g)] 

-  •(q(y-S,g)-q(y-S,g)  -  [q(y-S,g)  -  q(y-S,g)]} 

=  -C(g)Mg(y-S)-g(y-S)y  -  q^Cy-S.gjD  -  R(g,i), 

R(g,g)  =  [q(y-S,i)-q(y-S,g)-C(g)Mg(y-S)-g(y-S)>]  +  [q(y-S,g)-q(y-S,g)-qy(y-S,g)D] 

+  [q(y-S,g)-q(y-S,g)-q(y-S,g)+q(y-S,g)]  =  R^(g,g)  +  R2(g.g)  +  R3(g.g)- 

The  first  equation  here  is  an  inhomogeneous  linear  differential  equation,  with  final 
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condition     D|   _    =  0,     nonconstant  coefficient     -q  (y-S,g),     and  nonconstant  shift 
-?(g)'<g(y-S)-g(y-S)>  +  R(g,g).     Let     ^(t.z.g)  =  exp[-     q  (r,y-S(r,z.g).z.g)'Cdr]     ??. 

Then  the  solution  to  this  linear  equation  at     t  =  0     is 

(A.6)      DI^^Q  =  jJ[?(g)'{g(y-S)-g{y-S)>  +  R(g,i)]i^(t.z,g)dt  =  A(z,i-g)  +  jQR(g.i)?(t,z.g)dt. 


By $\tilde g(y-S)$ and $g(y-S)$ bounded and $T$ twice continuously differentiable, the elements of $\partial^2 T(\bar g(y-S))/\partial g^2$ will be bounded on $t \in [0,1]$, $z \in Z_\varepsilon$, for any $\bar g$ on a line joining $\tilde g$ and $g$ (that may differ from element to element of $\partial^2 T(\bar g)/\partial g^2$).  Then by a mean-value expansion, for all $t \in [0,1]$, $z \in Z_\varepsilon$,

(A.7)  $|R_1(g,\tilde g)| \le C\|p_t\|\,\|\partial^2 T(\bar g)/\partial g^2\|\,\|\tilde g(y-S)-g(y-S)\|^2 \le C\|\tilde g-g\|_0^2$.


By      |S|    <  C+e     and      jSj    <  C+e,     q(y-s,g)  -  q(y-s,g)     is  differentiable  in  an  open  interval 

containing     S     and     S.     Let     S  =  S(t,z,g,g)     be  the  mean  value  for  an  expansion  of 

q(y-S,g)     around     S,     with     S     between     S     and     S,     so  that     y-S  €  V     and      jS-Sj    ^   |D|.     A 

similar  statement  holds  for  the  mean  value     S       of  an  expansion  of     q  (y-S,g)     around     S. 

Then  for  all     t  €  [0,1],  z  e  Z  , 

c 

(A.8)  |R„(g,g)|    i    Iq„(y-S,g)-q   (y-S,g)l|D|    <    Iq     (y-S*.g)||D|^  s  Clli-gll^. 

^  y  y  yy  '-' 

Similarly, for a mean-value expansion of $q(y-s,\tilde g) - q(y-s,g)$ in $s$ around $S$, for all $t \in [0,1]$, $z \in Z_\varepsilon$,

(A.9)  jR^tg.i)!    s    |q^(y-S,i)-qy(y-S,g)|  IS-SI    ^  CIli-gll^lli-gllQ. 

where     g,      S,     and     S     denote  mean  values.     Then  combining  eqs.    (A.7)  -  (A.9)     and  noting 
that     ^(t,z,g)     is  bounded  uniformly  in     t  e  [0,1],     z  €  Z   ,     and     llg-g„ll   <  c,     we  have 
J'QR(g.g)€(t,z,g)dt  ^  Cllg-gll   llg-gll         so  the  first  conclusion  follows  by  eq.    (A.6).      QED. 

Proof of Theorem 1:  We first consider $\hat S$, and proceed by using the Lemmas of Section 5 of Newey (1992a) (N henceforth).  Let $f$, $h^q$, and $h^w$ denote possible values for the functions $f(x)$, $f(x)E[T^{-1}(q)|x]$ and $f(x)E[w|x]$ respectively, and $h = (f,h^q,h^w)$.  Let
g(z;p,h)  =  f(x)     [h^(x)-3'h    (x)]  +  P'w     and  for  any     h     let     L(x,h;/3,h)  = 
f(xr^[h^(x)-p'h^(x)]  -  f(xr^[h^(x)-p'h^(x)]f(x).     Let     p     and     h     in  N  equaUe.p' )'     and 

h     here.     Let     m(z,p,h)  =  (m  (z,P,h),m  (z.P.h)' )'      there  be     m  (z.^.h)  =  S(0,y  ,w    g(«;p,h)) 

-1        w 
-  e     and     m„(z,3,h)  =  T(x)[q-g(z;P,h)]®[w-f    (x)h    (x)].     For     A(z,g;g)     from  equation  (A. 2) 

let     D^(z,h;h,p)  =  A(z,L(',h;P,h);g(«;P,h))     and     D2(z,h;h,p)  = 

-T(x)f(x)"^q-g(z;p,h)]®[h^(x)-r^(x)h^(x)f(x)l  -  T(x)L(x.h;p,h)®lw-r\x)h'^(x)].     By 

Assumption  1  with     u(y)  =  l(y  =  y   )     it  follows  that     y^  e  V  .     Let     llgll  .     be  as  defined 

preceding  Assumption  1,  and     llhll .  =  sup        f^j   r     ~i\  n^.119  n(x)/5x  II.     Then  by  the  hypothesis 

that  the  density  of     x     is  bounded  away  from  zero  on     x(?'x[y,y]),     it  follows  by  a 

straightforward  application  of  the  quotient  rule  for  derivatives  that  if     llh-h   II .,      Ilh-h-ll ., 

and     lip-p_ll     are  small  enough  then     llg-gll .  ^  C llh-h II .     for     g  =  g(«:h,p)     and     g  =  g(«;h,p). 

Also,  it  follows  by  the  usual  mean  value  expansion  for  ratios  that  for  such     h     and     h, 

llg-g-L(',h-h;|3.h)ll      s  llh-hll    .     Then  by  the  conclusion  of  Lemma  1  and  by     llh-hll      £  llh-hll 

for     llh-h-^ll„     and     llh-h   II       small  enough, 

(A.IO)  I  m^(z,p,h)-m^(z,p,h)-D^(z,h-h;p,h)  I 

£    |A(z,g-i;i)-A(z,L(-,h-h;p,h),i)|    +  Cllg-illQllg-ill^ 

s  Cllg-g-L(«,h-h;P,h)ll      +  Cllh-hll    llh-hll     £  Cllh-hll    llh-hll  . 

It  also  follows  by  similar  reasoning  that  for     lip-p   II     and     lih-h   ll„     small  enough, 
(A.ll)  |D^(z,h;/3,h)|    =    |A(z,L(',h;/3,h);i)  I    s  CIIL(-,h;p,h)llQ  £  CIIHiIq, 

|D^(z,h;P,h)-D^{z,h;pQ,hQ)|    =  CllhllQllh-hQll^  +  Cllhll^(llh-hQllQ+llp-pQll), 

|m^(z,^,h)-m^(z,^Q,hQ)|   s  Cllh-hQil  +  011^-^^11. 
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It  also  follows  by  straightforward  algebra  that  for     lip-p   II,   llh-h^ll    ,     and     llh-h   II       small 
enough,  there  is     b(z)  =  C(l+llqll)     such  that 

(A.12)  llm2(z,p,h)-m2(z,p,h)-D2(z,h-h;p,h)ll  £  b(z)(llh-hQllQ)^,      |D2(z,h;p,h)  |   s  b(z)llhllQ. 

IID2(z.h;^.h)-D2(z.h;/3Q,hQ)ll  s  b(z)llhllQllh-hQllQ, 

Jlm^Cz./S.hl-m^Cz.pQ.hQJII  £  bCzKllh-h^llQ  +  IIP-PqII). 

4  i  k+2i    1/2     s  2 

Furthermore,     E[b(z)   ]  <  co     and  for     t)     =  [ln(n)/(n(r         )]       +o-       and     a  =  m/2,     we  have     tj 

n  n 

— >  0,     v'n(T7   )     — )  0,     Vncr  t)  t)     — >  0,     implying  that     l/(yno-  )  ^  ln(n)/(-/ncr  )  — ^  0. 
n  n  n 

Therefore,  the  hypotheses  of  Lemma  5.4  of  N  are  satisfied,  giving 

(A.13)  v/n(r°'£^"^[m^(z.,h.pQ)-m^(z.,hQ,pQ)]/n  =  ■/n(r°'£[m^(h)-m^(hQ)]  +  o  (1).     U  =  1,  2), 


for     a    =  a     and     a     =  0,     and     m.(h)  =  J'D„(z,h;h   ,P   )dF(z). 

Next,   by  hypothesis,     m   (h)  =  0.     Let     ^(t)  =  <(t(T),y   ,w   ,g   )     and     f   (t)  = 
KO^tCx)^!)  |St(T)/ST|      denote  the  density  of     t     when     t     is  uniformly  distributed  on 
[0,1].     By  the  inverse  function  theorem,     f   (t)     is  bounded  and  continuous  a.e.   with  compact 
support.     Then  by  the  definition  of     A     and     L, 

(A.14)  m^(h)  =  J"A(z,L(-,h;pQ,hQ);gQ)dF(z) 

=  JwCtlWxttDdt,     w(t)  =  fQ(x(T)r^C(T)'[-rQ(x(T)).I,-p^]f^(T). 


By  Assumptions  5  and  6,     cj(t)     is  bounded,   continuous  almost  everywhere,   and  zero  outside 
t([0,1]).     Also,   by  the  inverse  function  theorem  and  the  chain  rule,     x  (t)     is  continuously 

differentiable  with  bounded  derivatives  on     t(Z)),     a  compact  convex  set  containing     t([0,1]) 

2k  2s 

in  its  interior.     By  the  above  shown  conditions,     ncr       — >  m     and  no-       — >  0.     Then  it  follows 

by  Lemma  5.4  of  N  that     Vn'cr°'[m  (h)-m  (h    )]  -^  N(O.V    ).      It  then  follows  by  equation  (A.13), 

0-  — >  0,     and  the  triangle  inequality  that     Tncr  2^.m  (z.,p   ,h)/n     — >  N(0,V   ).     Furthermore, 
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by  equation  (A.13),     m^th)  =  0,     and     m^Ch^)  =  0     it  follows  that     V^cr"j;.m  (z.,p    h)/n  -^  0. 

Next,  note  that  by  Lemma  Al,     for     IIP-^qII     and     llh-h   ll„     small  enough,     m  (z.p.h)     is 
differentiable  in     p     with  derivative  that  is  bounded  uniformly  in     p     and     h,     and 
derivative  with  respect  to     9     equal  to     -1.     Also,     m„(z,3,h)     is  linear  in     p,     and  it  is 
straightforward  to  show  that     n    J]]._  9m  (z.,p,h)/ajS     converges  in  probability  to 
E[5m„(z,p^,h^)/S3],     with  a  first  column  of  zeros,  and  the  remaining  columns  being 
nonsingular  by  Assumption  2.     The  first  conclusion  then  follows  by  a  Taylor  expansion. 

To  show  the  second  conclusion  it  is  useful  to  verify  Assumption  5.2  of  N  for  both 

m      and     m„.     For     m  ,     parts  i)  -  ill)  of  Assumption  5.2  follow  by  eqs.   (A. 10)  -  (A.ll), 

with     A  =  0,  A,   =  A„  =  1,  and     A„  =  0.     Also,   part  iv)  is  satisfied  with     5„     =  1/2,     by 
1         z  o  pn 

3k+4  2s 

the  conditions  that     no-         /ln(n)  — >  oo     and     n<r       — >  0.     It  then  follows  by  Lemma  5.5  of 

N  that     (T^).   ,U./n  — >  V„     for     U.  =  0.   +  A.  -  >  .   ,A./n.     Also,   it  is  straightforward  to 
^1=1    1  0  1       ^1  J       ^j=l   J  ^ 

check  that  Assumption  5.2  of     N     is  satisfied  for     m„,     so  that  for     ip.  = 

T.(w.-E[w|x.])(T    (q.)-g(x.,w.))>,     cr     ).   Jli/».ll  /n  — ^  0.     Also,  by  arguments  similsir  to 
11  1    .  Ill  ^1=1     1 

those  for  asymptotic  normality,     G       is  bounded  in  probability  and     M  — ^  M,     so  the 
second  conclusion  follows  by  the  triangle  inequality.     QED. 

Proof of Theorem 2:  The proof proceeds by verifying the hypotheses of Theorem A.1 of Newey (1992b), which will henceforth be referred to as N.  Assumption A.1 of N follows by Assumption 7 and boundedness of $a(z,g_0)$ in each case.  Assumption A.2 of N follows by Assumption 8 here and Lemma 8.4 of Newey (1991), with $\|\varphi^K\|_j \le CK^{1+2j}$, $j = 0, 1, 2$.  Assumption A.3 of N follows by Assumption 9 here and Lemma 8.2 of Newey (1991), with $\alpha_j$

equal  to  any  positive  number.     Assumption  A.  4  of  N,  with     A  =  2,  A    =  0,  A     =  A     =  1, 

follows  by  Lemma  Al  for  each  case.     Furthermore,  by     Kn       — >  oo     for  some     r  >  0,     it 

follows  that  for  any     F,,   F^  >,     there  are     a  ,     such  that     n  iK  zK     d  — >  0.     Therefore, 

12  d 

ll^^ll2lK^''^/Vn  +  K""2]  £  C'K^'K^^'^/Vn  +  o(l)  =  o(l). 
V^il^^llQJI^'^iyK^^^/v^  +  K'"o)(K^^^/v7r  +  K""i)  £  CK-K^-K/i/n  +  o(l)  =  o(l). 
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K^'''^\\4>^\\^/Vn  s  CK^^'^K^/i/n  =  o(l),     K^'^'^H^W^/Vn  £  CK^^V/'/n  =  o(l).     v^""o  =  o(l). 


so that the rate hypotheses of Theorem A.1 of N are satisfied.  Therefore, it suffices to show that for $\theta = S$ or $\theta = L$, Assumption A.6 or A.7 of N is satisfied, while for $\theta = \mu$ or $\theta = \lambda$, Assumption A.5 of N is satisfied.

For     S,     the     A(z,g;g   )     from  N  is     J^tx)' [r{x(p(T),yQ-S(T))+w^p]dT.     Consider     g 
values  with     3  =  0.     It  follows  as  in  the  proof  of  Lemma  A.  4  of  N  that  Assumption  A. 6  of 
N  is  satisfied.     For     L,     x_  =  x(p  .y^)     and     T       =  T  (g  (x^.w   )), 

E[A(z.r(x);gQ)]  =  JCCt)' r(x(T))f^(T)dT  -  (pS°)'T  qFIx^). 

By     T  -^     nonsingular  and     p    *  p       there  exists     r     such  that     (p -p  )'T     r  *  0.     Also  by 
T     continuously  distributed  and     p(t)     one-to-one  on     (0,1),     Prob(x(T)=x   )  =  0. 
Therefore,  by  reasoning  similar  to  that  in  the  proof  of  Lemma  A. 4  of  N,  there  exists 
r.(x)     with     r.(x   )  =  r,     that  is  everywhere  continuous,  bounded  uniformly  in     j, 

converges  to  zero  for  all     x  *  x   ,     as     j  — >  oo,     and  hence     /^(x)' r(x(T))f   (x)dx  — >  0, 

K  K  2 

so  there  exists     r    (x)  =  $    (x)'t)        such  that     E[ll$    (x)'-n    II    ]  — >  0     and     E[A(z,r    (x);g   )] 
K.  K.  K.  K  U 

— >  -  (p  -p  )'T     r  *  0,     so  that  Assumption  A. 6  of  N  is  satisfied. 
For     y.,     A(z,g;g   )     in  N  is     u    (j(y)A(z,g;g   )     here,   so  that 

wE[A(z,g;gQ)]  =  E[u(y.)g(x.,w.)'C(t.,z.)]  =  E[E[tj(y.)C(t.,z.)'  |x.,w.]g(x..w.)] 

=  E[J'E[w(y.)C(t.,z.)'  |x.=x,w.=w]g(x,w)f(x|w)dx] 

=  E[/C(x,w)'g(x,w)f(x|w)dxl  =  E[C(x.,w.)'g(x.,w.)]  =  E[C(x.,w.)'g(x.,w.)], 


where the last equality follows by straightforward calculation.  It also follows similarly that Assumption A.5 of N is satisfied for $\lambda$.  Then the conclusion follows by Theorem A.1 of N.  QED.
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